Abstract

In this paper we are concerned with non-parametric inference on the volatility of
volatility process $\tau^2$ in stochastic volatility models. We construct an estimator for its
integrated version $\int_0^t \tau_s^2\,ds$ in a high-frequency setting which is based on increments of
spot volatility estimators, and we prove both feasible and infeasible central
limit theorems at the optimal rate $n^{-1/4}$. Such CLTs can be widely used in practice,
as they are the key to essentially all tools in model validation for stochastic volatility
models. As an illustration we apply our results to goodness-of-fit testing, providing
the first consistent test for a certain parametric form of the volatility of volatility.

1 Introduction

Nowadays, stochastic volatility models are standard tools in the continuous-time modelling
of financial time series. Typically, the underlying (log) price process is assumed to follow
a diffusion process of the form

$$X_t = X_0 + \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dW_s, \tag{1.1}$$

where $\mu$ and $\sigma$ can be quite general stochastic processes themselves. A classical case is
where the volatility $\sigma_s^2 = \sigma^2(s, X_s)$ is a function of time and state, a situation referred to
as a local volatility model. Empirical finance has shown that such
models do not fit the data very well, as some stylised facts such as the leverage effect or
volatility clustering cannot be explained using local volatility only. Stochastic volatility
models, however, are able to reproduce such features, as they bear an additional source of
randomness. In these models the volatility process is a diffusion process itself, having the
representation

$$\sigma_t^2 = \sigma_0^2 + \int_0^t \nu_s\,ds + \int_0^t \tau_s\,dV_s, \tag{1.2}$$

where $\nu$ and $\tau$ again are suitable stochastic processes and $V$ is another Brownian motion,
correlated with $W$.

Standard stochastic volatility models are parametric ones, and probably the prime
example among those is the Heston model of [17], given by



for some parameters $\beta$, $\kappa$, $\alpha$ and $\xi$, and with $\mathrm{Corr}(W, V) = \rho$. Here, the volatility process
follows a Cox-Ingersoll-Ross model, that is, it is mean-reverting with mean $\alpha$ and speed
$\kappa$, and the two diffusion coefficients are proportional, $\tau = \xi\sigma$. Such behaviour
appears to be rather typical for stochastic volatility models, and in this sense the Heston
model can be regarded as prototypical. Popular alternatives come, for example, from the
more general (but again parametric) class of CEV models, where the diffusion coefficient $\tau$
becomes a power function of $\sigma$, whereas the drift part of the volatility remains in principle
the same. See [25] for a survey.

For this reason, statistical inference for stochastic volatility models has mostly focused on
parametric methods, and usually the authors provide tools for a specific
class of models. However, one is faced with two severe problems: First, it is in most
cases impossible to assess the distribution of $X$ (or its increments), which makes standard
maximum likelihood theory unavailable. Second, the volatility process $\sigma^2$ is not observable,
and many statistical concepts have in common that they propose to reproduce the unknown
volatility process from observed option prices, typically by using proxies based on implied
volatility. A survey of early estimation methods in this context can be found in [11].
One remarkable exception where only stock price data is used is the paper [10], which
constructs a GMM estimator for the parameters of the Heston model from increments
of realised variance. But also in a general setting with no specific model in mind, the
focus has been on parametric approaches. An early approach to parameter estimation
when $\sigma^2$ is ergodic is the work of [15], optimal rates are discussed in [18] and [16], and a
maximum likelihood approach based on proxies for the volatility can be found in [4]. Even
non-parametric concepts have been used to identify parameters of a stochastic volatility
model, see for example [9] or [30].

Genuine non-parametric inference for stochastic volatility models has typically focused
on function estimation. Both [25] and [12] discuss techniques for the estimation of $f$ and $g$
when the volatility process satisfies $d\sigma_t^2 = f(\sigma_t^2)\,dt + g(\sigma_t^2)\,dV_t$. In the more general
model-free context of (1.2) only [8] have discussed estimation of functionals of the process $\tau$ by
providing a consistent estimator for the integrated volatility of volatility $\int_0^t \tau_s^2\,ds$. Their
approach is inspired by the asymptotic behaviour of realised variance, which states that
the sum of squared increments of $\sigma^2$ converges in probability to the quantity of interest.
Since $\sigma^2$ is not observable, the authors use spot volatility estimators instead.

We will pursue their approach and define a slightly different estimator for integrated
volatility of volatility which attains the optimal rate of convergence in this context, also
using observations of $X$ only. Furthermore, a stable central limit theorem is provided, and
by defining appropriate estimators for the asymptotic (conditional) variance we obtain
a feasible version as well. The latter result is of theoretical interest on the one hand, but is
extremely important from an applied point of view as well, as it makes model validation for
stochastic volatility models possible. Given the tremendous number of such models with
entirely different qualitative behaviours, there is a lack of techniques that help to decide
whether a certain model fits the data appropriately or not.

As a first result on model validation in this whole framework we give an example on
goodness-of-fit testing, but our method is by no means limited to it. Related procedures
can be used to test e.g. whether a Brownian component or jumps are present in the
volatility process and what in general the structure of the jump part is. Such problems
have been solved for the price process $X$ in recent years (see [22] for an overview), and in
principle the methods are all based on the estimation of plain integrated volatility $\int_0^t \sigma_s^2\,ds$
and further quantities. Using our main results, they can be translated to the stochastic
volatility case by using integrated volatility of volatility instead. See the conclusions in
Section 5 for some hints on further research.

Finally, the paper is organised as follows: In Section 2 we introduce our estimator and
state the two central limit theorems, whereas Section 3 presents a method for
goodness-of-fit testing in stochastic volatility models. Some Monte Carlo results can be found in
Section 4, and as noted before we give a conclusion in Section 5. Most proofs can be found
in the Appendix, which is Section 6.

2 Main results 

Suppose that the process $X$ is given by (1.1), where $W$ is a standard Brownian motion
and the drift process $\mu$ is left continuous. We assume further that the volatility process $\sigma^2$
is a continuous Itô semimartingale itself, having the representation (1.2), where $\nu$ is left
continuous as well and $(W, V)$ are jointly Brownian with correlation parameter $\rho \in [-1, 1]$.
Note that $|\rho| = 1$ corresponds to $V = \pm W$ almost surely, in which case we are essentially
in the setting of a local volatility model, whereas $|\rho| < 1$ refers to the genuine stochastic
volatility case. Our aim is to draw inference on the integrated volatility of volatility, which
is $\int_0^t \tau_s^2\,ds$. To this end we impose a regularity condition on $\tau$, namely

$$\tau_t^2 = \tau_0^2 + \int_0^t \omega_s\,ds + \int_0^t \vartheta_s^{(1)}\,dW_s + \int_0^t \vartheta_s^{(2)}\,dV_s + \int_0^t \vartheta_s^{(3)}\,d\overline{W}_s, \tag{2.1}$$

where $\overline{W}$ is another Brownian motion, independent of $(W, V)$, $\omega$ is locally bounded and
each $\vartheta^{(l)}$ is left continuous, $l = 1, 2, 3$. Finally, we assume that all processes are defined on
the same probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ and that all coefficients are specified in such
a way that $\sigma^2$ and $\tau^2$ are almost surely positive. These assumptions are all rather mild
and are covered by a variety of (stochastic) volatility models used in practice.

Any statistical inference in this work will be based on high-frequency observations of
$X$, and we assume that the data are recorded at equidistant times. Thus, without loss of
generality let the process be defined on the interval $[0,1]$ and observed at the time points
$i/n$, $i = 0, \ldots, n$. Just as standard integrated volatility is estimated using (squared)
increments of $X$, a reasonable estimator for integrated volatility of volatility can be built
upon increments of $\sigma^2$. These are in general not observable, so a proxy for them is needed.
Since we are in a model-free world, a natural estimator for spot volatility $\sigma^2_{i/n}$ is given by

$$\hat\sigma^2_{i/n} = \frac{n}{k_n} \sum_{j=1}^{k_n} (\Delta^n_{i+j} X)^2$$

for some auxiliary (integer-valued) sequence $k_n$ and where we have set $\Delta_i^n Z = Z_{i/n} - Z_{(i-1)/n}$
for any process $Z$. See [5] or [30] for details on the asymptotic behaviour of this estimator.
Equation (2.9), which is a simple consequence of Itô's formula, together with Lemma 6.1 shows later
on that

$$\hat\sigma^2_{i/n} - \sigma^2_{i/n} = O_p\big(\sqrt{k_n/n} + \sqrt{1/k_n}\big)$$

holds, which explains the choice $k_n = c n^{1/2} + o(n^{1/4})$ for some $c > 0$ that we will use in the
following.
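The spot volatility estimator above is a rolling sum of squared increments over the next $k_n$ observations. The following is a minimal sketch in Python, assuming the observations $X_{i/n}$, $i = 0, \ldots, n$, are stored in an array; the function name `spot_volatility` is hypothetical and not part of the paper.

```python
import numpy as np

def spot_volatility(x, k_n):
    """Return sigma2_hat[i] = (n / k_n) * sum_{j=1..k_n} (X_{(i+j)/n} - X_{(i+j-1)/n})**2
    for i = 0, ..., n - k_n, where x holds the n + 1 observations X_{i/n}."""
    x = np.asarray(x, dtype=float)
    n = x.size - 1
    sq = np.diff(x) ** 2                         # sq[m] is the squared increment number m + 1
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    window = csum[k_n:] - csum[:-k_n]            # window[i] sums increments i + 1, ..., i + k_n
    return (n / k_n) * window
```

With $k_n$ of order $n^{1/2}$, as chosen in the text, each local estimate averages roughly $\sqrt{n}$ squared increments.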

An estimator for integrated volatility of volatility will now be defined as a sum over
squared increments $\hat\sigma^2_{(i+l_n)/n} - \hat\sigma^2_{i/n}$. What is a reasonable choice for $l_n$? (2.2) suggests
that one should not take $l_n$ of smaller order than $k_n$, as otherwise the estimation error
$\hat\sigma^2_{i/n} - \sigma^2_{i/n}$ (which is of order $\sqrt{k_n/n}$) dominates the quantity of interest $\sigma^2_{(i+l_n)/n} - \sigma^2_{i/n}$
(which has order $\sqrt{l_n/n}$). We will see that we can indeed take $l_n$ equal to $k_n$, which
guarantees convergence at the optimal rate, but in this case a bias correction becomes
necessary. For this reason we define local estimators for the process $\tau^2$ as follows: With a
slight abuse of notation we set
where the latter is in general different from $(\hat\sigma^2_{i/n})^2$. A global estimator for integrated
volatility of volatility then becomes

$$\hat V_t = \frac{1}{n} \sum_{i=0}^{\lfloor nt \rfloor - 2k_n} \hat\tau^2_{i/n}. \tag{2.3}$$

Its asymptotic behaviour is discussed in the following theorem.
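As an illustration of how such a global estimator could be computed, here is a hedged Python sketch. The exact local estimators are not legible in this text, so the forms used below are assumptions of the sketch: $\hat\tau^2_{i/n} = \frac{3n}{2k_n}(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n})^2 - \frac{6n}{k_n^2}\hat\sigma^4_{i/n}$ with a quarticity-type proxy $\hat\sigma^4_{i/n} = \frac{n^2}{3k_n}\sum_{j=1}^{k_n}(\Delta^n_{i+j}X)^4$. These choices match the heuristics above (squared increments of spot volatility with $l_n = k_n$, plus a bias correction of order $\sigma^4/k_n$), but the constants and the function name `integrated_vol_of_vol` are hypothetical.

```python
import numpy as np

def integrated_vol_of_vol(x, k_n, t=1.0):
    """Sketch of a bias-corrected estimator of int_0^t tau_s^2 ds from observations x of X.

    Assumed (not taken from the source) local estimators:
      tau2_hat   = 1.5 * (n / k_n) * (sigma2_hat[i + k_n] - sigma2_hat[i])**2
                   - 6 * (n / k_n**2) * sigma4_hat[i]
      sigma4_hat = n**2 / (3 * k_n) * sum of fourth powers of the next k_n increments
    """
    x = np.asarray(x, dtype=float)
    n = x.size - 1
    dx = np.diff(x)
    c2 = np.concatenate(([0.0], np.cumsum(dx ** 2)))
    c4 = np.concatenate(([0.0], np.cumsum(dx ** 4)))
    sigma2 = (n / k_n) * (c2[k_n:] - c2[:-k_n])              # spot volatility, i = 0..n-k_n
    sigma4 = (n ** 2 / (3 * k_n)) * (c4[k_n:] - c4[:-k_n])   # assumed proxy for sigma^4
    last = int(np.floor(n * t)) - 2 * k_n                    # largest admissible index i
    if last < 0:
        raise ValueError("need floor(n * t) >= 2 * k_n")
    i = np.arange(last + 1)
    incr = sigma2[i + k_n] - sigma2[i]
    tau2 = 1.5 * (n / k_n) * incr ** 2 - 6.0 * (n / k_n ** 2) * sigma4[i]
    return tau2.sum() / n
```

For a path with constant volatility the squared increments of spot volatility consist of estimation noise only, and the second term is what removes their mean; without it the estimate would be biased upwards.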



Theorem 2.1 Under the above assumptions we have the central limit theorem

$$\sqrt{\frac{n}{k_n}}\Big(\hat V_t - \int_0^t \tau_s^2\,ds\Big) \xrightarrow{\mathcal{L}-(s)} U_t \tag{2.4}$$

for all $t > 0$, where the limiting variable has the representation

$$U_t = \int_0^t \alpha_s\,dW'_s, \qquad \alpha_s^2 = \frac{48}{c^4}\sigma_s^8 + \frac{12}{c^2}\sigma_s^4\tau_s^2 + \frac{151}{70}\tau_s^4,$$

$W'$ is a Brownian motion defined on an extension of the original probability space and
independent of $\mathcal{F}$, and the convergence in (2.4) is $\mathcal{F}$-stable in law. For details on this
mode of convergence see for example 



Proof of Theorem 2.1 We will give a sketch of the proof here and relegate some
tedious details to the Appendix. In general, $\mathcal{F}$-stable convergence of a sequence $Z_n$ to some
limiting variable $Z$ defined on an extension $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{P}})$ of the original space is equivalent to

$$E(h(Z_n)Y) \to E(h(Z)Y) \tag{2.5}$$

for any bounded Lipschitz function $h$ and any bounded $\mathcal{F}$-measurable $Y$. For details, see
for example [23] and related work. Suppose now that there are additional variables $Z_{n,p}$
and $Z_p$ (the latter defined on the same extension as $Z$) such that

$$\lim_{p\to\infty}\limsup_{n\to\infty} E|Z_n - Z_{n,p}| = 0, \tag{2.6}$$

$$Z_{n,p} \xrightarrow{\mathcal{L}-(s)} Z_p \quad \text{for all } p, \tag{2.7}$$

$$\lim_{p\to\infty} E|Z_p - Z| = 0 \tag{2.8}$$

hold. Then the desired stable convergence $Z_n \xrightarrow{\mathcal{L}-(s)} Z$ follows. Indeed, let $\varepsilon > 0$. Then
there exists a $\delta > 0$ such that $|x - y| < \delta$ implies $|h(x) - h(y)| < \varepsilon$. Thus we have

$$|E(h(Z_n)Y) - E(h(Z_{n,p})Y)|
\le C\Big(E\big(|h(Z_n) - h(Z_{n,p})|\,1_{\{|Z_n - Z_{n,p}| \ge \delta\}}\big) + E\big(|h(Z_n) - h(Z_{n,p})|\,1_{\{|Z_n - Z_{n,p}| < \delta\}}\big)\Big)
\le C\big(P(|Z_n - Z_{n,p}| \ge \delta) + \varepsilon\big)$$

for a generic $C > 0$. From Markov's inequality, (2.6) and as $\varepsilon$ was arbitrary, we have

$$\lim_{p\to\infty}\limsup_{n\to\infty} |E(h(Z_n)Y) - E(h(Z_{n,p})Y)| = 0.$$

The same argument using (2.8) yields

$$\lim_{p\to\infty} |E(h(Z_p)Y) - E(h(Z)Y)| = 0,$$

and (2.7) is by definition equivalent to

$$\lim_{n\to\infty} |E(h(Z_{n,p})Y) - E(h(Z_p)Y)| = 0.$$

Putting the latter three claims together (plus the triangle inequality and the fact that all
three limiting conditions on $p$ and $n$ are actually the same) gives (2.5).

Our aim in this proof is to employ a certain blocking technique, which allows us to
make use of a type of conditional independence between the estimators $\hat\tau^2_{i/n}$. To this end we
apply the above methodology, so we have to define an appropriate double sequence $U_t^{n,p}$,
which will correspond to the sum of approximated versions of $\hat\tau^2_{i/n}$ over the big blocks.
Some additional notation is necessary. First of all, recall that Itô's formula gives

$$\hat\sigma^2_{i/n} = \frac{n}{k_n}\sum_{j=1}^{k_n} 2\int_{(i+j-1)/n}^{(i+j)/n} (X_s - X_{(i+j-1)/n})\,dX_s + \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n} \sigma_s^2\,ds =: A_{i/n} + B_{i/n}. \tag{2.9}$$

The main part of $\hat\tau^2_{i/n}$ is some functional of increments of $A$ and $B$, and as noted above we
need certain approximations for these in the sequel. Let $p \in \mathbb{N}$ be arbitrary. We set

$$a_\ell(p) = (\ell - 1)(p + 2)k_n, \qquad b_\ell(p) = a_\ell(p) + pk_n, \qquad c(p) = J_n(p)(p + 2)k_n + 1,$$

the first two for any $\ell = 1, \ldots, J_n(p)$ with $J_n(p) = \lfloor (\lfloor nt \rfloor - 2k_n)/((p + 2)k_n) \rfloor$. These numbers
depend on $n$ as well, even though it does not show up in the notation. We define further
$H_i = \int_{(i-1)/n}^{i/n} (W_s - W_{(i-1)/n})\,dW_s$ and

$$\tilde A_{\frac{i+k_n}{n}} - \tilde A_{\frac{i}{n}} = 2\,\frac{n}{k_n}\,\sigma^2_{\frac{a_\ell(p)}{n}} \sum_{j=1}^{k_n} (H_{i+j+k_n} - H_{i+j})
= \frac{n}{k_n}\,\sigma^2_{\frac{a_\ell(p)}{n}} \sum_{j=1}^{k_n} \big((\Delta^n_{i+j+k_n} W)^2 - (\Delta^n_{i+j} W)^2\big), \tag{2.10}$$

where the latter identity is a consequence of Itô's formula, and

$$\tilde B_{\frac{i+k_n}{n}} - \tilde B_{\frac{i}{n}} = \frac{n}{k_n}\,\tau_{\frac{a_\ell(p)}{n}} \int_{i/n}^{(i+k_n)/n} (V_{s + k_n/n} - V_s)\,ds.$$

These quantities are defined over the big blocks, that is for $i = a_\ell(p), \ldots, b_\ell(p) - 1$. Up to a
different standardisation, the role of $Z_{n,p}$ in this proof will be played by $U_t^{n,p} = \sum_{\ell=1}^{J_n(p)} U_\ell^{n,p}$,

$$U_\ell^{n,p} = \sum_{i=a_\ell(p)}^{b_\ell(p)-1} \Big\{ \frac{3}{2k_n}\Big(\big(\tilde A_{\frac{i+k_n}{n}} - \tilde A_{\frac{i}{n}}\big) + \big(\tilde B_{\frac{i+k_n}{n}} - \tilde B_{\frac{i}{n}}\big)\Big)^2 - \frac{6}{k_n^2}\,\sigma^4_{\frac{a_\ell(p)}{n}} - \frac{1}{n}\,\tau^2_{\frac{a_\ell(p)}{n}} \Big\}, \tag{2.11}$$
which again involves quantities from the big blocks only. The $U_\ell^{n,p}$ can be shown to be
martingale differences, and the most involved part in the proof is to obtain

$$\lim_{p\to\infty}\limsup_{n\to\infty} \sqrt{\frac{n}{k_n}}\; E\Big|\hat V_t - \int_0^t \tau_s^2\,ds - U_t^{n,p}\Big| = 0, \tag{2.12}$$

which is the analogue of (2.6). Let us focus on the remaining two steps as well. We set



TT p [* l \ irrr, ( n2 P ( 48 P + d l 8 , 12 P + d 2 4 2 , l ^lp + d 3 4 

U? = ^ a( P ) s dW s , a(p). = (—^°. + + — j^r s 

for certain unspecified constants $d_l$, $l = 1, 2, 3$. In order to prove the stable convergence

$$\sqrt{\frac{n}{k_n}}\, U_t^{n,p} \xrightarrow{\mathcal{L}-(s)} U_t^p \tag{2.13}$$

we use a well-known result for triangular arrays of martingale differences, which is due to
Jacod [TI5]. In particular, the following three conditions have to be checked, where we call
$E_i^n$ the conditional expectation with respect to $\mathcal{F}_{i/n}$.



Jn(p) 

n 

2 Jn(p) 

n 



T,K<?)W) 2 ] / <Pfs d ^ ( 2 - 14 ) 
l=i Jo 

Up) 

, Jn(p) 

V n " 

where $N$ is any component of $(W, V)$ or a bounded martingale orthogonal to both $W$ and
$V$. The final step $\lim_{p\to\infty} E|U_t^p - U_t| = 0$ is obvious. □

Remark 2.2 It is quite likely that a functional central limit theorem holds as well, but a 
proof of this result appears to be somewhat more involved. In any case, the claim above 
is sufficient for most of the statistical applications we have in mind. □ 

Remark 2.3 The rate of convergence in Theorem 2.1 is $n^{-1/4}$, and it is optimal for this
statistical problem. Indeed, a related parametric setting has been discussed in [18] a
decade ago, and it was shown therein that this rate is optimal in the special case where
$W$ and $V$ are independent and $\tau$ is a function of time and state, known up to a parameter
$\theta$. □



Remark 2.4 As noted in the introduction, an alternative estimator has been defined in
[8]. Apart from an additional truncation in the spot volatility estimators to make these
robust to jumps in the price process, the main difference between both approaches is the
different orders of $k_n$ and $l_n$. Their choices grant consistency for the integrated volatility
of volatility without a bias correction as above, but as a drawback the optimal rate cannot
be attained. □
The limiting distribution in Theorem 2.1 is mixed normal, and in order to obtain a
feasible central limit theorem we have to introduce a consistent estimator for the
conditional variance $\int_0^t \alpha_s^2\,ds$. This term is a sum of three quantities, and regarding the parts
involving the process $\tau$, we will rely on the previously introduced local estimators. To be
precise, for the integral over $\tau^4$ we will base an estimator on fourth powers of increments of
$\hat\sigma^2$, and again a suitable bias correction is necessary, whereas the estimator for the mixed
part is built directly from $\hat\tau^2_{i/n}$ and $\hat\sigma^4_{i/n}$. For the term involving powers of $\sigma$ only, there are
several possibilities (involving standard power variations), but for computational reasons
we choose one which is based on $\hat\sigma^4_{i/n}$ as well. Altogether we obtain the following result,
and a proof of this claim (just as for all later ones as well) can be found in the Appendix.
Theorem 2.5 Set 

\nt\-kn [nt\-2k n {nt\-2k n 2 

_ 1 V CtM 2 f? (2) - V T 2 n 4 r (3) - V 71 (a 2 n 2 ^ 

II n II n n It nj^-, n n 

1=1 1=1 1=1 " 

Then we have the convergence in probability 



G$ A falds, (2.17) 
J o 



o 



o 



and as a consequence 

~ n _ 453 (3) n 486 (2) n 2 1038 (1) - [* 



Remark 2.6 Theorem 2.5 shows that a consistent estimator for $\int_0^t \tau_s^4\,ds$ is given by

and its proof suggests that a central limit theorem holds with the same rate of convergence
as before. In general, it is quite likely that this method provides estimates for arbitrary
even powers of integrated volatility of volatility. A concise theory is beyond the scope of
this paper, however. □



The properties of stable convergence guarantee that dividing by the square root of a
consistent estimator for $\int_0^t \alpha_s^2\,ds$ gives a feasible central limit theorem for the estimation
of integrated volatility of volatility. See for example [28] for details. Therefore we end this
section with the following corollary.
Corollary 2.7 Under the previous assumptions we have weak convergence

V "'fi \/c n 

for all t > 0. 



3 Model checks for stochastic volatility models 

Suppose that we are given a stochastic volatility model (with continuous paths), that is we
have the representation (1.1) for the log price process $X$ and a volatility process satisfying
(1.2). There is still a lot of freedom in the modelling of $\sigma^2$, and the various proposals in
the literature typically differ in the representation of its diffusion part $\tau$. As noted in the
introduction, a quite general class of stochastic volatility models is given by the so-called
CEV models, in which $\tau_s = \theta(\sigma_s^2)^\gamma$ for some non-negative $\gamma$ and an unknown parameter
$\theta$, and the most popular among these is the Heston model from [17], corresponding to
$\gamma = 1/2$.

Given the number of different stochastic volatility models, there is a lack of techniques
in goodness-of-fit testing. We will partly fill this gap and employ a technique which was
already used in [13] or [ST] when dealing with local volatility models. Let us explain
the methodology by taking the example of a Heston model, for which $\nu_s = \kappa(\alpha - \sigma_s^2)$
and $\tau_s^2 = \xi^2\sigma_s^2$. Since it is in general impossible to obtain information on the drift part
of a semimartingale from high-frequency observations only, we will solely focus on the
diffusion part. Thus our aim is to test whether the specific functional relationship of
proportionality between $\tau^2$ and $\sigma^2$ is true or not. Let us have a look at the stochastic
process

$$N_t = \int_0^t (\tau_s^2 - \theta_{\min}\sigma_s^2)\,ds, \qquad \theta_{\min} = \arg\min_\theta \int_0^1 (\tau_s^2 - \theta\sigma_s^2)^2\,ds.$$

Under the null hypothesis of a Heston-like diffusion process $\sigma^2$, the process $N_t$ is obviously
equal to zero for all $t$. Therefore a promising approach is to define an estimate $\hat N_t$, which
will be based on the heuristics from the previous section, and to prove weak convergence
of the statistic $\sqrt{n/k_n}\,(\hat N_t - N_t)$ to a certain limiting process $A_t$, for which we use Theorem
2.1 and related results. Test statistics can then be constructed via functionals of $\sqrt{n/k_n}\,\hat N_t$,
which converge weakly to the same functionals of $A_t$ if the underlying process
is indeed coming from a Heston model.

This approach is of course not limited to the Heston model, which is why we
return to the general case. Suppose that (1.2) holds. We are interested in testing for $\tau_s^2 =
\tau^2(s, X_s, \sigma_s^2, \theta)$, where $\tau^2$ is a given function and $\theta$ is some unknown (in general
multi-dimensional) parameter. For simplicity, we will focus on the one-dimensional linear case
only, that is

$$H_0: \tau_s^2 = \theta\,\tau^2(s, X_s, \sigma_s^2) \quad \text{for all } s \in [0, 1] \text{ (a.s.)}$$
Extensions to the general case follow along the lines of Section 5 in [31].

A test for the null hypothesis will be based on the observation that $H_0$ is equivalent
to $N_t = 0$ for all $t \in [0, 1]$ (a.s.), where the process $N_t$ is given by

$$N_t = \int_0^t \big(\tau_s^2 - \theta_{\min}\tau^2(s, X_s, \sigma_s^2)\big)\,ds, \qquad \theta_{\min} = \arg\min_\theta \int_0^1 \big(\tau_s^2 - \theta\tau^2(s, X_s, \sigma_s^2)\big)^2\,ds.$$

Assume that the function $\tau^2$ is bounded away from zero. Then a standard argument from
Hilbert space theory shows that $\theta_{\min} = D^{-1}C$ (and therefore $N_t = V_t - B_t D^{-1} C$), where
we have set $V_t = \int_0^t \tau_s^2\,ds$ and

$$B_t = \int_0^t \tau^2(s, X_s, \sigma_s^2)\,ds, \qquad D = \int_0^1 \tau^4(s, X_s, \sigma_s^2)\,ds, \qquad C = \int_0^1 \tau_s^2\,\tau^2(s, X_s, \sigma_s^2)\,ds.$$

To define reasonable estimators for the various quantities above let $k_n$ be as before and recall
(2.2). We set $\hat N_t = \hat V_t - \hat B_t \hat D^{-1} \hat C$ with $\hat V_t$ from the previous section, whereas we denote

$$\hat B_t = \frac{1}{n}\sum_{i=0}^{\lfloor nt \rfloor - k_n} \tau^2\big(\tfrac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\big), \qquad
\hat D = \frac{1}{n}\sum_{i=0}^{n - k_n} \tau^4\big(\tfrac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\big), \qquad
\hat C = \frac{1}{n}\sum_{i=0}^{n - 2k_n} \hat\tau^2_{i/n}\,\tau^2\big(\tfrac{i}{n}, X_{i/n}, \hat\sigma^2_{i/n}\big).$$

In the sequel we will prove weak convergence of $\hat N_t - N_t$, up to a suitable normalisation.
Theorem 2.1 suggests that $\sqrt{n/k_n}$ is a reasonable choice, and the following claim proves
that two of the estimators converge at a faster speed, at least if we impose an additional
smoothness condition on the function $\tau^2$.
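The projection step behind $\theta_{\min} = D^{-1}C$ and the resulting process can be sketched in a few lines of Python. This is a simplified illustration only: it assumes that local estimates of $\tau^2$ and the values of the hypothesised function are already available on a common grid, it uses the same summation range for all three sums (the text uses slightly different upper limits), and the name `fit_process` is hypothetical.

```python
import numpy as np

def fit_process(tau2_hat, model_tau2, n):
    """Least-squares projection of local tau^2 estimates on a hypothesised function.

    tau2_hat[i]   : local estimate of tau^2 at time i/n
    model_tau2[i] : hypothesised function evaluated at (i/n, X_{i/n}, sigma2_hat_{i/n})
    Returns (theta_hat, n_hat) with theta_hat = C_hat / D_hat and
    n_hat[i] = V_hat - B_hat * theta_hat evaluated along the grid.
    """
    tau2_hat = np.asarray(tau2_hat, dtype=float)
    g = np.asarray(model_tau2, dtype=float)
    v_hat = np.cumsum(tau2_hat) / n        # running estimate of V_t
    b_hat = np.cumsum(g) / n               # running estimate of B_t
    d_hat = np.sum(g ** 2) / n             # estimate of D
    c_hat = np.sum(tau2_hat * g) / n       # estimate of C
    theta_hat = c_hat / d_hat
    return theta_hat, v_hat - b_hat * theta_hat
```

If the local estimates are exactly proportional to the hypothesised function, the returned process is identically zero, which is the discrete counterpart of $N_t = 0$ under the null hypothesis.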

Lemma 3.1 Suppose that the function $\tau^2$ has continuous partial derivatives of second
order. Then we have

$$\hat B_t - B_t = o_p(n^{-1/4}), \qquad \hat D - D = o_p(n^{-1/4}),$$

the first result holding uniformly in $t \in [0, 1]$.

The above claim indicates that we have to focus on the terms involving $\hat\tau^2_{i/n}$ only, which
is familiar ground due to the results of Section 2. We start with a proposition on the joint
asymptotic behaviour of $\hat V_t$ and $\hat C$.

Lemma 3.2 Let $d$ be an integer and $t_1, \ldots, t_d$ be arbitrary in $[0, 1]$. Set

$$\Sigma_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2) = \alpha_s^2\, h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)\, h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)^T$$

with $h_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2) = (1_{[0,t_1]}(s), \ldots, 1_{[0,t_d]}(s), \tau^2(s, X_s, \sigma_s^2))^T$ and $\alpha_s^2$ as in Theorem 2.1.
Under the previous assumptions we have the stable convergence

$$\sqrt{\frac{n}{k_n}}\,\big(\hat V_{t_1} - V_{t_1}, \ldots, \hat V_{t_d} - V_{t_d}, \hat C - C\big)^T \xrightarrow{\mathcal{L}-(s)} \int_0^1 \Sigma^{1/2}_{t_1,\ldots,t_d}(s, X_s, \sigma_s^2)\,dW'_s,$$

where $W'$ is a $(d + 1)$-dimensional standard Brownian motion defined on an extension of
the original space and independent of $\mathcal{F}$.
We are interested in the asymptotics of the process $A_n(t) = \sqrt{n/k_n}\,(\hat N_t - N_t)$, and the
preceding lemma basically leads to its finite dimensional convergence. The entire result
on weak convergence of $A_n$ reads as follows.



Theorem 3.3 Assume that the previous assumptions hold. Then the process $(A_n(t))_{t \in [0,1]}$
converges weakly to a mean zero process $(A(t))_{t \in [0,1]}$, which is Gaussian conditionally on
$\mathcal{F}$ and whose conditional covariance equals the one of the process

$$\big\{\alpha_U \big(1_{\{U \le t\}} - B_t D^{-1}\tau^2(U, X_U, \sigma_U^2)\big)\big\}_{t \in [0,1]},$$

where $U \sim \mathcal{U}[0, 1]$, independent of $\mathcal{F}$.



As indicated before, convergence of the finite dimensional distributions is a direct
consequence of Lemma 3.2 using the Delta method for stable convergence (see e.g. [14]).
Tightness follows from Theorem VI.4.5 in [23] with a minimal amount of work.

Recall that $N_t = 0$ for all $t$ under the null hypothesis. Therefore Theorem 3.3 shows
that a consistent test is obtained by rejecting the null hypothesis for large values of a
suitable functional of the process $\{\sqrt{n/k_n}\,\hat N_t\}_{t \in [0,1]}$. If we choose the Kolmogorov-Smirnov
functional $K_n = \sup_{t \in [0,1]} \sqrt{n/k_n}\,|\hat N_t|$ for example, we have weak convergence under the
null to $\sup_{t \in [0,1]} |A_t|$ as a consequence of Theorem 3.3. The distribution of the latter
statistic is extremely difficult to assess, as it typically depends on the entire process $(A, \sigma^2)$.
We will thus pursue a different path and obtain critical values via a bootstrap
procedure, which we will discuss in the next section in detail.



Remark 3.4 In practice, one should test beforehand whether modelling via stochastic
volatility is actually appropriate or not. At least two recent procedures should be
mentioned here: [27] propose a test which discriminates between local volatility and stochastic
volatility models and which is based on the sign of increments of $X$ and of increments of
spot volatility, which tend to be equal if both $X$ and its volatility process are driven by
the same Brownian motion. [30] discusses more generally semi-parametric techniques for
the estimation of the correlation parameter $\rho$ between $W$ and $V$. □



Remark 3.5 An alternative approach to model validation could be based on an
appropriate $L^2$ distance, instead of working with empirical processes. To be precise, set

$$M^2 = \int_0^1 \big(\tau_s^2 - \theta_{\min}\tau^2(s, X_s, \sigma_s^2)\big)^2\,ds$$

with $\theta_{\min}$ as above. Then the null hypothesis is equivalent to $M^2 = 0$ almost surely,
and a natural estimator for $M^2$ can be defined in the same way as for $N_t$. Nevertheless,
the asymptotic theory for $M^2$ is a bit more involved, since a central limit theorem for an
estimator of $\int \tau_s^4\,ds$ is necessary and a discussion of such a theory is beyond the scope of
this paper. See [14] for the asymptotic theory of the analogue of $M^2$ in the local volatility
setting. □
To end this section we define an appropriate estimator for the conditional variance of
$A(t)$, which is given by

$$s_t^2 = \int_0^t \alpha_s^2\,ds - 2B_t D^{-1}\int_0^t \alpha_s^2\,\tau^2(s, X_s, \sigma_s^2)\,ds + B_t^2 D^{-2}\int_0^1 \alpha_s^2\,\tau^4(s, X_s, \sigma_s^2)\,ds,$$

due to Theorem 3.3. Empirical counterparts for $B_t$ and $D$ are obviously defined by the
statistics $\hat B_t$ and $\hat D$, whereas Theorem 2.5 suggests that a local estimator for $\alpha^2_{i/n}$ is given
by



/c 2 V280' : 35 n n / kl 1225 . 

We obtain the following result, which can be proven in the same way as Theorem 2.5.



2 n 2 /453 2 2 x4 ^ 2A ^ n 6 346 ^ 8 



Theorem 3.6 Let t be arbitrary and set 



\n.t\-2k n [nt\—2k n 

- V a 2 -2B t D~ 1 ± V a 2 r 2 ^,X,,a 2 ) 
n z — ' n n i n n n n 

1=1 1=1 

[nt\-2k n 

+B 2 D- 2 - y a 2 r 2 (-,X,,a 2 ). 

i=l 



Then $\hat s_t^2$ is consistent for $s_t^2$.



As a consequence, each statistic $\sqrt{n/k_n}\,\hat N_t/\hat s_t$ converges weakly to a standard normal
distribution under the null hypothesis. This result will be used to construct a feasible bootstrap statistic in the following.



4 Simulation study 

Let us start with a simulation study concerning the performance of $\hat V_t$ as an estimator for
integrated volatility of volatility. Throughout this section we will work with the Heston
model only, and the parameters are chosen as follows: $\beta = 0.3$, $\kappa = 5$, $\alpha = 0.2$ and
$\xi = 0.5$. Furthermore, we set $X_0 = 0$ and $\sigma_0^2 = \alpha$. Note that the Feller condition $2\kappa\alpha > \xi^2$
is satisfied, which ensures that the process $\sigma^2$ is almost surely positive as requested. So
is $\tau^2$, and it is obvious that (2.1) holds as well. Therefore all conditions from Section 2
are satisfied.

We begin with the finite sample properties of the central limit theorem from Theorem 2.1
in its infeasible version, which is

$$\sqrt{\frac{n}{k_n}}\;\frac{\hat V_t - \int_0^t \tau_s^2\,ds}{\sqrt{\int_0^t \alpha_s^2\,ds}} \xrightarrow{\mathcal{L}} N(0, 1). \tag{4.1}$$

That means we use the unobservable (conditional) variance $\int_0^t \alpha_s^2\,ds$ to standardise
correctly instead of using an estimator for it. We discuss the performance of this result for
different choices of the correlation parameter $\rho$ and the number of observations $n$, and in all
cases we take $n$ to be a square number and $k_n = n^{1/2}$, so we have $c = 1$. Finally we
set $t = 1$.
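For readers who want to reproduce a comparable setting, the following Python sketch simulates a Heston path with the parameter values stated above. It is not the simulation scheme of the paper, which is not described in the text: the Euler discretisation with full truncation, the number of substeps, and the price drift $\beta - \sigma_s^2/2$ are all assumptions of this sketch, and `simulate_heston` is a hypothetical name.

```python
import numpy as np

def simulate_heston(n, rho=0.0, beta=0.3, kappa=5.0, alpha=0.2, xi=0.5,
                    substeps=10, seed=None):
    """Euler scheme with full truncation on [0, 1]; returns (X, sigma^2) at times i/n.

    Assumptions of this sketch: X_0 = 0, sigma_0^2 = alpha, and a log-price drift
    beta - sigma^2 / 2 (the drift is not legible in the source text).
    """
    rng = np.random.default_rng(seed)
    m = n * substeps                       # fine simulation grid
    dt = 1.0 / m
    z1 = rng.standard_normal(m)
    z2 = rng.standard_normal(m)
    dw = np.sqrt(dt) * z1
    dv = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho ** 2) * z2)  # Corr(dW, dV) = rho
    x = np.empty(m + 1)
    v = np.empty(m + 1)
    x[0], v[0] = 0.0, alpha
    for j in range(m):
        vp = max(v[j], 0.0)                # truncate negative variance at zero
        x[j + 1] = x[j] + (beta - 0.5 * vp) * dt + np.sqrt(vp) * dw[j]
        v[j + 1] = v[j] + kappa * (alpha - vp) * dt + xi * np.sqrt(vp) * dv[j]
    return x[::substeps], v[::substeps]
```

With $n$ a square number, for example $n = 10000$, the choice $k_n = n^{1/2} = 100$ corresponds to $c = 1$ as in the text.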

[INSERT TABLE 1 ABOUT HERE] 

Table 1 shows the finite sample behaviour of (4.1) for $\rho = 0$, always based on 10,000
simulations. We see that mean and variance are rather close to the limiting values in most
cases. In the remaining columns we show some empirical quantiles in the tails, that is we
state both $\alpha$ and the relative number of times where the statistic in (4.1) was below the
corresponding $\alpha$-quantile of the standard normal distribution. These values appear to be
reproduced in a satisfying way as well.

[INSERT TABLE 2 ABOUT HERE] 

The same situation has been analysed for $\rho = -0.2$, which corresponds to a moderate
leverage effect, that is a negative correlation between increments of price and volatility, and the
results are in general comparable to the previous ones. Note that (4.1) does not depend
at all on the choice of $\rho$, but some smaller order terms do, as can be seen from the proof.
These might affect the quality of approximation for finite samples, but apparently they
do not.

[INSERT TABLE 3 ABOUT HERE] 

For the feasible statistic from Corollary 2.7 the situation is somewhat different, as it
takes more time for the asymptotics to kick in. Apparent is a slight overestimation of the
lower tails of the distribution, which seems to originate from the relation of the estimators
$\hat V_1$ and $G_1^{(3)}$. By construction, in cases where $\hat V_1$ is underestimating the true quantity, it is
typically the case that increments of $\hat\sigma^2$ are relatively small. As these increments occur in
$G_1^{(3)}$ as well, most likely the asymptotic variance is underestimated as well, which explains
a too large negative standardised statistic. The same effect is visible for the upper quantiles
as well (but resulting in an overestimation), and this simple explanation is supported by
a detailed look at simulation results not reported here, which reveal that the estimation
of the asymptotic variance is extremely accurate for moderate sizes of $\hat V_1 - \int_0^1 \tau_s^2\,ds$, but
becomes worse when the deviation is rather large.

[INSERT TABLE 4 ABOUT HERE] 

As an example for an application in goodness-of-fit testing, we have constructed a test
for a Heston-like volatility structure via a bootstrap procedure as follows: Based on the
observation that for each $t$, $\sqrt{n/k_n}\,\hat N_t/\hat s_t$ converges weakly to a standard normal
distribution if the null is satisfied, it seems reasonable to reject the hypothesis for large values of
the standardised Kolmogorov-Smirnov statistic $Y_n = \sup_{i \le n - 2k_n} |\sqrt{n/k_n}\,\hat N_{i/n}/\hat s_{i/n}|$. Since
its (asymptotic) distribution is in general hard to assess, we used bootstrap quantiles
instead. More precisely, we have generated bootstrap data sets, $b = 1, \ldots, B$, following the
equation

$$X_t^* = \int_0^t \sigma_s^*\,dW_s^*, \qquad (\sigma_t^*)^2 = \hat\alpha + \int_0^t \hat\kappa\big(\hat\alpha - (\sigma_s^*)^2\big)\,ds + \hat\xi\int_0^t \sigma_s^*\,dV_s^*.$$

Here, $W^*$ and $V^*$ are independent Brownian motions, and we have identified $\hat\alpha$ with the
realised volatility of the original data (which is a measure for the average volatility over
$[0,1]$) and defined $\hat\xi = \hat\theta^{1/2}$, since both quantities coincide under the null. Finally, we have
simply set $\hat\kappa = 5\hat\theta/\hat\alpha$ such that Feller's condition is satisfied. Setting $B = 200$, we have
run 500 simulations each.
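The last step of such a procedure, turning the bootstrap replications into a test decision, can be sketched as follows in Python. The sketch assumes the statistic $Y_n$ and its $B$ bootstrap counterparts have already been computed; the function name `bootstrap_test` is hypothetical, and the p-value with a "+1" correction is a common convention chosen here, not something stated in the paper.

```python
import numpy as np

def bootstrap_test(stat, boot_stats, level=0.05):
    """Reject for large values of stat, using the empirical (1 - level) bootstrap quantile.

    Returns (reject, critical_value, p_value). The p-value uses the
    (1 + #{boot >= stat}) / (B + 1) convention, which is an assumption of this sketch.
    """
    boot_stats = np.asarray(boot_stats, dtype=float)
    crit = float(np.quantile(boot_stats, 1.0 - level))
    p_value = (1 + int(np.sum(boot_stats >= stat))) / (boot_stats.size + 1)
    return bool(stat > crit), crit, p_value
```

With $B = 200$ replications as above, the 5% critical value is determined by roughly the ten largest bootstrap statistics, so simulated levels carry some Monte Carlo noise.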

[INSERT TABLE 5 ABOUT HERE] 

Table 5 shows that the simulated levels are rather close to the expected ones, irrespec- 
tively of n. We have tested two alternatives from the class of CEV models, namely 

a t =(J o + K ( a ~ &1)ds + Vt and of = a 1 + k (a — a 2 s )ds + \fn / cr 2 dV s , 
Jo Jo Jo 

corresponding to $\gamma = 0$ and $\gamma = 1$, respectively, and using the parameters from above.
We see from the simulation results that the rejection probabilities are much larger for the
second alternative than for the first, which can partially be explained by two observations:
First, the Vasicek model does not satisfy the assumptions from the previous sections since
the volatility may become negative (in which case it is set to zero); second, our choice of $\kappa$
is responsible for a large speed of mean reversion in the bootstrap algorithm which makes
it difficult to distinguish between a Heston-like volatility of volatility and a constant one.
It is expected that the power improves for an entirely data-driven choice of $\kappa$.
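
To make the role of the CEV exponent concrete, here is a minimal simulation sketch. It assumes the usual CEV parametrisation of the variance, $d\sigma_t^2 = \kappa(\alpha - \sigma_t^2)\,dt + \xi(\sigma_t^2)^{\gamma}\,dV_t$, an Euler discretisation, and truncation at zero as described above for the Vasicek case; the function name and signature are illustrative and not taken from the paper.

```python
import math
import random


def simulate_cev_variance(n, sigma0_sq, alpha, kappa, xi, gamma, rng):
    """Euler path of the variance on [0, 1]; returns n + 1 values.

    gamma = 0 gives a constant volatility of volatility (Vasicek type),
    gamma = 0.5 a Heston-like one, gamma = 1 the second alternative.
    """
    dt = 1.0 / n
    path = [sigma0_sq]
    for _ in range(n):
        v = path[-1]
        dv = rng.gauss(0.0, math.sqrt(dt))
        v_next = v + kappa * (alpha - v) * dt + xi * (v ** gamma) * dv
        path.append(max(v_next, 0.0))  # negative values are set to zero
    return path
```

With `xi = 0` the recursion is deterministic and relaxes monotonically towards `alpha`, which gives a quick sanity check of the mean-reversion term.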

[INSERT TABLE 6 ABOUT HERE] 



5 Conclusion 

In this paper we have discussed a non-parametric method for the estimation of the integrated
volatility of volatility process $\int_0^t \tau_s^2\,ds$ in stochastic volatility models. Our estimator is
based on spot volatility estimators, and just as for standard realised volatility we use
sums of squared increments of these to obtain a global estimator $V_t$ for $\int_0^t \tau_s^2\,ds$, up to a further bias
correction. It is shown that $V_t$ converges at the optimal rate $n^{-1/4}$, and we provide both
an infeasible and a feasible central limit theorem for it.
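
The construction summarised above can be sketched in a few lines. This is a hedged illustration, not the paper's implementation: the exact definition of the estimator is given in an earlier section, and the sketch assumes the form $\sum_i \big[\tfrac{3}{2k_n}(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n})^2 - \tfrac{6}{k_n^2}\hat\sigma^4_{i/n}\big]$, with local averages of squared increments as spot estimators and a standard local quarticity estimator for the bias correction. The function name and the choice of quarticity estimator are assumptions.

```python
def integrated_vol_of_vol(increments, k_n):
    """Sketch of an estimator for the integrated volatility of volatility on [0, 1].

    increments[m] is the (m+1)-th high-frequency log-price increment.
    The spot variance at i/n is estimated from the k_n increments after i/n.
    """
    n = len(increments)
    if k_n < 1 or n < 2 * k_n:
        raise ValueError("need at least 2 * k_n increments")
    total = 0.0
    for i in range(n - 2 * k_n + 1):
        first = increments[i:i + k_n]
        second = increments[i + k_n:i + 2 * k_n]
        spot_now = (n / k_n) * sum(x * x for x in first)
        spot_next = (n / k_n) * sum(x * x for x in second)
        # local quarticity estimate (assumed form): E|increment|^4 = 3 sigma^4 / n^2
        quarticity = (n * n / (3.0 * k_n)) * sum(x ** 4 for x in first)
        total += 1.5 / k_n * (spot_next - spot_now) ** 2
        total -= 6.0 / (k_n * k_n) * quarticity  # bias correction
    return total
```

The window length `k_n` would be chosen proportional to the square root of `n`; the loop costs of order `n * k_n` operations, which is acceptable for a sketch but could be replaced by rolling sums.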

Given the variety of stochastic volatility models (in continuous time) which are used
to describe financial data, there is a severe lack of tools for model validation. Our results
help to fill this gap, as we provide a promising method for goodness-of-fit testing in such
models which investigates whether a specific parametric model for the volatility of volatility
is appropriate given the data or not. But several further applications are possible as well,
particularly if we turn to the even more general context of models including jumps in price
and volatility. Non-parametric inference in this context is rare as well, but worth mentioning is
recent work by Jacod and collaborators on the existence and the form of joint jumps in
both processes (see [23] and [ST]).

Following our results, many more questions regarding jumps in the volatility process
can be tackled now, and to explain possible extensions of our approach let us have a look
at the case of jumps in the price process. Several statistical tools have been developed
over the past years that help to answer, for example, whether there are jumps or not in the process,
whether there are finitely or infinitely many, or what in general their degree of activity is
(see foremost [1]-[3], but also [7] or [26]). Most of these procedures are based on realised
volatility and related quantities, such as truncated versions or bipower variation. Theorem
2.1 indicates that similar methods are likely to work for the volatility process as well, but
usually with the slower rate of convergence $n^{-1/4}$. A detailed analysis is beyond the scope
of this paper, however.
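
For readers unfamiliar with the price-process quantities mentioned above, the following sketch shows their standard textbook forms: realised variance, a truncated version that discards large increments, and bipower variation with the usual normalising constant $\pi/2$. The function names and the fixed truncation level are illustrative assumptions; in practice the threshold shrinks with the sampling frequency.

```python
import math


def realised_variance(increments):
    """Sum of squared increments."""
    return sum(x * x for x in increments)


def truncated_variance(increments, threshold):
    """Realised variance using only increments with |x| <= threshold."""
    return sum(x * x for x in increments if abs(x) <= threshold)


def bipower_variation(increments):
    """(pi / 2) times the sum of products of adjacent absolute increments."""
    pairs = zip(increments[1:], increments[:-1])
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in pairs)
```

Comparing `realised_variance` with either of the other two gives a rough indication of the contribution of jumps, which is the idea behind many of the cited procedures.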

A different issue is to take microstructure noise into account, which is likely to be
present when data is observed at high frequency. Again it is promising to combine filtering
methods for noisy diffusions with the method proposed in this paper to obtain an estimator
for integrated volatility of volatility in such models as well, but the rate of convergence is
expected to drop further. Again, precise statements on the asymptotics are left for further
research.

6 Appendix 

Note first that every left-continuous process is locally bounded, thus all processes appearing 
are. Second, standard localisation procedures as in [6] or [20] allow us to assume that any 
locally bounded process is actually bounded, and that almost surely positive processes can 
be regarded as bounded away from zero. Universal constants are denoted by $C$ or $C_r$, the
latter if we want to emphasise dependence on some additional parameter $r$.

6.1 Proof of (2.12)

We have a couple of somewhat lengthy steps to show. Start with the following observation: 
If we set 

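$$U_t^n = \sum_{i=0}^{\lfloor nt\rfloor - 2k_n} \frac{3}{2k_n}\big(\hat\sigma^2_{(i+k_n)/n} - \hat\sigma^2_{i/n}\big)^2 - \frac{6}{c^2}\int_0^t \sigma_s^4\,ds - \int_0^t \tau_s^2\,ds,$$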

then a simple computation proves that 

\nt\-2k n 2 V nt \ 

1=0 1=1 

where the error term is due to boundary effects. Theorem 2.1 in [6] and the definition of 
k n give 
uniformly in $t$. Therefore we are left to show



$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\,E\big|U_t^n - U_t^{n,p}\big| = 0. \qquad (6.1)$$



To this end, we need an auxiliary result on the increments of A and B and their approxi- 
mations, and we introduce similar terms over the small blocks. Set 



.7=1 



$$\bar D_{(i+k_n)/n} - \bar D_{i/n} = \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n} \tau_{b_\ell(p)/n}\big(V_{s+k_n/n} - V_s\big)\,ds,$$

both for $i = b_\ell(p), \ldots, a_{\ell+1}(p) - 1$.

Then the following claim holds. 



Lemma 6.1 We have

$$E\big|A_{(i+k_n)/n} - A_{i/n} - (\bar A_{(i+k_n)/n} - \bar A_{i/n})\big|^r \le C_r (pn^{-1})^{r/2},$$

$$E\big|B_{(i+k_n)/n} - B_{i/n} - (\bar B_{(i+k_n)/n} - \bar B_{i/n})\big|^r \le C_r (pn^{-1})^{r/2},$$

as well as

$$E\big|A_{(i+k_n)/n} - A_{i/n}\big|^r \le C_r n^{-r/4} \quad\text{and}\quad E\big|B_{(i+k_n)/n} - B_{i/n}\big|^r \le C_r n^{-r/4} \qquad (6.2)$$

for every $r > 0$. The latter bounds hold also for the approximated versions, and the same
results are true for $C$ and $D$ and their approximations as well.



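Proof. Note first that

$$B_{(i+k_n)/n} - B_{i/n} - (\bar B_{(i+k_n)/n} - \bar B_{i/n}) = \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\int_s^{s+k_n/n} \nu_r\,dr\,ds + \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\int_s^{s+k_n/n} (\tau_r - \tau_{a_\ell(p)/n})\,dV_r\,ds, \qquad (6.3)$$
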

and due to the boundedness of $\nu$ the $r$-th moment of the first summand is bounded by
$C_r n^{-r/2}$. Thus we focus on the latter summand only. Since $\sigma^2$ and $\tau^2$ are continuous
Itô semimartingales and both processes are bounded below by some positive constant,
an application of Itô's formula shows that $\sigma$ and $\tau$ are continuous Itô semimartingales
themselves, with similar representations. Therefore several applications of the Hölder and
Burkholder inequalities yield



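$$E\Big|\frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\int_s^{s+k_n/n}(\tau_r - \tau_{a_\ell(p)/n})\,dV_r\,ds\Big|^r \le C_r\frac{n}{k_n}\int_{i/n}^{(i+k_n)/n} E\Big|\int_s^{s+k_n/n}(\tau_r - \tau_{a_\ell(p)/n})\,dV_r\Big|^r ds$$

$$\le C_r\frac{n}{k_n}\int_{i/n}^{(i+k_n)/n} E\Big[\Big(\int_s^{s+k_n/n}(\tau_r - \tau_{a_\ell(p)/n})^2\,dr\Big)^{r/2}\Big]\,ds \le C_r (pn^{-1})^{r/2}.$$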
To see that $(A_{(i+k_n)/n} - A_{i/n}) - (\bar A_{(i+k_n)/n} - \bar A_{i/n})$ has the same property, note that this
term can be written as the sum of two quantities, for which the first one is



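$$\frac{n}{k_n}\sum_{j=1}^{k_n} 2\int_{(i+j-1)/n}^{(i+j)/n}(X_s - X_{(i+j-1)/n})\mu_s\,ds + \frac{n}{k_n}\sum_{j=1}^{k_n} 2\int_{(i+j-1)/n}^{(i+j)/n}\big((X_s - X_{(i+j-1)/n})\sigma_s - (W_s - W_{(i+j-1)/n})\sigma^2_{a_\ell(p)/n}\big)\,dW_s. \qquad (6.4)$$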



The other quantity has a similar representation, but involves integrals within the interval
$[(i + k_n)/n, (i + 2k_n)/n]$. The $r$-th moment of the first term in (6.4) is of order $n^{-r/2}$ as
before, whereas the Burkholder inequality, the martingale property of $W$ plus the semimartingale
representation of $\sigma$ give the desired bound for the approximation error concerning $A$.
The bounds in (6.2) follow in a similar way. □

A simple consequence of Lemma 6.1 is that the remainder terms in $U_t^n$ are negligible,
that is



\nt\ —2k n 



lim lim sup oF " ~ 7J L, ^ " / 



i=c(p) 



f 

e(p) 



t?cZs 



using also boundedness of the processes on the right hand side and the definition of c(p). 
A similar claim holds for the approximation of the integrands, and we restrict ourselves 
to the big blocks and prove 



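$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\,E\Big|\sum_{\ell=1}^{J_n(p)}\int_{a_\ell(p)/n}^{b_\ell(p)/n}\big(\tau_s^2 - \tau^2_{a_\ell(p)/n}\big)\,ds\Big| = 0. \qquad (6.5)$$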



Recall (2.1). The result above follows from



J n (p) P kM 



E V / 



1=1 



e(.p) I a l(p) 



Urdrds 



pk n n 



and 



Jn(p) ,5lM „ v 2 Jn(p) 



HE 



a»(p) 



£=1 



Mp) 

' i?W<W r dsV= V e( / " / i?WdW r dsV < Cp 2 n-\ 
°£(p) r / ^ \ hiM hiM J ~ 



since the terms involving $\vartheta^{(2)}$ and $\vartheta^{(3)}$ can be treated in the same way. Note that the
analogue of (6.5) involving $\sigma^4$ instead of $\tau^2$ holds for the same reasons. We have further



lim lim sup J 7— E } — -[jT p) "! 

p^oo ?woo y k n I |— ^ n \k^ & J - 



(p) 



0, 



(6.6) 



which by boundedness of $\sigma$ amounts to proving $n^{-3/4}(k_n^2 - nc^2) = o(1)$, and the latter is
satisfied by definition of $k_n$. Again, (6.6) holds over the small blocks as well.
The most involved part is of course the error due to the approximation of increments
of $A$ and $B$. Our aim is to prove



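$$\lim_{p\to\infty}\limsup_{n\to\infty}\sqrt{\frac{n}{k_n}}\,E\Big|\sum_{\ell=1}^{J_n(p)}\sum_{i=a_\ell(p)}^{b_\ell(p)-1}\frac{3}{2k_n}\Big(\big((A_{(i+k_n)/n} - A_{i/n}) + (B_{(i+k_n)/n} - B_{i/n})\big)^2 - \big((\bar A_{(i+k_n)/n} - \bar A_{i/n}) + (\bar B_{(i+k_n)/n} - \bar B_{i/n})\big)^2\Big)\Big| = 0, \qquad (6.7)$$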



and from the proof it will become obvious that similar methods give the analogous result
for the approximation via $\bar C_{(i+k_n)/n} - \bar C_{i/n}$ and $\bar D_{(i+k_n)/n} - \bar D_{i/n}$ within the small blocks.

First, the binomial theorem tells us that we can discuss the approximation for $B$, the
one for $A$ and the mixed part separately. Using further $x^2 - y^2 = 2y(x - y) + (x - y)^2$
and $xx' - yy' = (x - y)y' + y(x' - y') + (x - y)(x' - y')$, we see from Lemma 6.1 and the
growth conditions that (6.7) follows from $\lim_{p\to\infty}\limsup_{n\to\infty}\sum_{k=1}^{4} E|L_k^{n,p}| = 0$ with

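$$L_1^{n,p} = \sqrt{\frac{n}{k_n}}\sum_{\ell=1}^{J_n(p)}\sum_{i=a_\ell(p)}^{b_\ell(p)-1}\frac{1}{k_n}\big((B_{(i+k_n)/n} - B_{i/n}) - (\bar B_{(i+k_n)/n} - \bar B_{i/n})\big)(\bar B_{(i+k_n)/n} - \bar B_{i/n}), \qquad (6.8)$$

$$L_2^{n,p} = \sqrt{\frac{n}{k_n}}\sum_{\ell=1}^{J_n(p)}\sum_{i=a_\ell(p)}^{b_\ell(p)-1}\frac{1}{k_n}\big((B_{(i+k_n)/n} - B_{i/n}) - (\bar B_{(i+k_n)/n} - \bar B_{i/n})\big)(\bar A_{(i+k_n)/n} - \bar A_{i/n}), \qquad (6.9)$$

$$L_3^{n,p} = \sqrt{\frac{n}{k_n}}\sum_{\ell=1}^{J_n(p)}\sum_{i=a_\ell(p)}^{b_\ell(p)-1}\frac{1}{k_n}\big((A_{(i+k_n)/n} - A_{i/n}) - (\bar A_{(i+k_n)/n} - \bar A_{i/n})\big)(\bar A_{(i+k_n)/n} - \bar A_{i/n}), \qquad (6.10)$$

$$L_4^{n,p} = \sqrt{\frac{n}{k_n}}\sum_{\ell=1}^{J_n(p)}\sum_{i=a_\ell(p)}^{b_\ell(p)-1}\frac{1}{k_n}\big((A_{(i+k_n)/n} - A_{i/n}) - (\bar A_{(i+k_n)/n} - \bar A_{i/n})\big)(\bar B_{(i+k_n)/n} - \bar B_{i/n}). \qquad (6.11)$$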

Let us start with the claim for (6.8), and we discuss the part within (6.3) involving $\nu$
first. We have $\nu_r = (\nu_r - \nu_{a_\ell(p)/n}) + \nu_{a_\ell(p)/n}$. The latter term is treated using Lemma 6.1,
as we have

Jn(p)b t (p)-1 i+k n s+ kn 2 

T~ E (Y1 J2 T*( " / (6.12) 

3 J„(p) 6f(p)-l i + kn. „ s+ fen 2 

= E ( L " / "^Ws)fe-B,)) ^Cpn- 1 / 2 , 

which converges to zero for any fixed $p$. For the other one we use left continuity of $\nu$.
From Fubini's theorem and two applications of the Cauchy-Schwarz inequality we have



E( / / (t/ r — z/ q^ (p) )drdsj < / E|i/ r — | dr, 
therefore Lemma 6.1 shows

Jn(p) bi{p)-l 

r E \ E E £ 

€=1 i=a*(p) n 



J„(p)b f (p)-1 l±2kn 



< Cn- 3 / 4 ^ E ( /• " E|t/ r -i/ n<(p) | 2 dr 

f=l i=ai(p) n 



{v r — V a t (p) )drds ) (B i+kr,. — Bj_ ) 

n 

1/2 



(ri| 2 dr) < C( / E|f r - z/^^dr 



(6.13) 

1/2 



where $[p,n](r)$ denotes the largest $a_\ell(p)/n$ smaller than $r$. We call $\gamma(n,p)$ the right hand side
above. For fixed $r$ and $p$, $[p,n](r)$ converges to $r$ from the left, so $|\nu_r - \nu_{[p,n](r)}|^2$ converges
to zero pointwise as well, since $\nu$ is left continuous. By boundedness of $\nu$ and Lebesgue's
theorem, $\gamma(n,p)$ is a zero sequence for all $p$, and we are done with this part as well. Finally,



-J n (p)b e (p)-1 .. i+k r 

n 1=1 i=a e (p) n n 



(r r — Ta e ( P ) )dV r ds ) (B i+k n — Bj_ 



(6.14) 



is treated as follows: Call $\tau'_t$ the sum of the last three terms in (2.1). Then $\tau_r - \tau_{a_\ell(p)/n} =
\int_{a_\ell(p)/n}^{r} \omega_u\,du + (\tau'_r - \tau'_{a_\ell(p)/n})$, and using Lemma 6.1 we have



Jn(p) bi(p)-l 

■«|E E 

£=1 i=ae(p) 



> + kn 



a e(p) 



oj u dudV r ds ) (B z+k n — Bj_ 



,1/2 



n 1 



-/2' 



which converges to zero for every $p > 0$. Have a look at the first Brownian term in
$\tau'_r - \tau'_{a_\ell(p)/n}$, for which the decomposition



a l(p ) MM 



holds. We use the fact that $(W, V)$ is jointly Brownian. Conditioning on $\mathcal{F}_{a_\ell(p)/n}$, properties
of the normal distribution show that



•Ai(p) b e (p)-l 

E E E 



k n 



n 



1=1 i=a e (p) 
Jn(p) b e (p)-l 

E E ( E 



k 2 



(p) ^ dW u dF r ds) (5*^ - 5±)J (6.15) 



n 



. i + k n 



dwwds ) {B^ - B ± 



=1 i=a e (p) 

which is of order $p^2 n^{-1/2}$. On the other hand, the Cauchy-Schwarz and Burkholder inequalities
give

, i + kr, 



E 



s+- 



(VV - $%)dW u dV r ds 
thus from Lemma 6.1 and the Cauchy-Schwarz inequality again

Jn(p) h(p)— l r i±k 



V£ e Ie £ /. " .„^-^ m " dKds ) {S ^- s i ) 

1=1 i=a£(p) n n 

Jn(p)b e (p)-1 i+k„ rs+ !hL ,. r 



=1 i=a e {p) 

Jn{p)b t {p)-1 l + kn 

EE(// ' 

^=1 i=ai(p) n 

Convergence to zero for any fixed $p$ can be deduced in the same way as for (6.13). The
remaining two summands in $\tau'_r - \tau'_{a_\ell(p)/n}$ can be treated similarly, thus the claim for (6.8)
is entirely shown.

Note that the first two steps go through for (6.9) as well: To obtain the analogue of
(6.12) we need the unbiasedness of $\bar A_{(i+k_n)/n} - \bar A_{i/n}$ and the bounds of Lemma 6.1 only,
and the latter are used for (6.13) as well. In the proof of (6.14) the only difference regards
(6.15), as

«*- e 5(7," / " i, t ,/^ w ^y^-h? 

l = ai(p) n n 

is not unbiased. Nevertheless, let us have a look at $E^n_{a_\ell(p)}(Y_\ell^n)$. Since $W$ and $V$ are jointly
Brownian, standard Itô calculus yields

. i+kn „o-L^» - i+3 . 



J, " E [ ( J " (W r - Wa^ )dV T >j ( (W u - W i+ i-i)dW« 



ds 



< C 



n 3 



for arbitrary $j = 1, \ldots, 2k_n$ and regardless of $i$. Using the representation in (2.10) we obtain
$|E^n_{a_\ell(p)}(Y_\ell^n)| \le Cpn^{-1}$. Furthermore, the previously used arguments yield $E(Y_\ell^n)^2 \le Cp^3 n^{-3/2}$.
Therefore

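$$\frac{n}{k_n}E\Big(\sum_{\ell=1}^{J_n(p)} Y_\ell^n\Big)^2 = \frac{n}{k_n}\sum_{\ell=1}^{J_n(p)} E(Y_\ell^n)^2 + 2\frac{n}{k_n}\sum_{\ell=1}^{J_n(p)}\sum_{m<\ell} E(Y_\ell^n Y_m^n)$$

$$\le Cp^2 n^{-1/2} + 2\frac{n}{k_n}\sum_{\ell=1}^{J_n(p)}\sum_{m<\ell} E\big(|E^n_{a_\ell(p)}(Y_\ell^n)|\,|Y_m^n|\big) \le C\big(p^2 n^{-1/2} + p^{1/2} n^{-1/4}\big), \qquad (6.16)$$

which gives the analogue of (6.15), so the claim for (6.9) is entirely shown.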

Let us come to the proof of (6.10). Recall (6.4). As for Lemma 6.1 we will only show
the claim for the summands involving integrals over $[i/n, (i + k_n)/n]$, as the entire proof is
obtained in the same way. Let us start with the $ds$-part. Using the bounds from Lemma
6.1 we have

n „l v — v \ - n 



A f 



E | E E £(E L ( L - Ai) < Cn" 1 /* (6 17) 

D 1 ■ - Tl " — 1 " " 



1=1 i=a t (p) n j=l 
and that



Jn(p) bt(p)-i k„ .i+i 

£=1 i=o £ (p) n J^l" 7 ^ - 



cr u dW u )(n s - Ua((v))ds)(Ai+ hn - A±) (6.18) 



n n 



can be bounded by a certain zero sequence $\gamma(n,p)$ in the sense following (6.13). We set
further



i+i 



i=ae(p) 3=± n n 



a r dW r ds\ (A i+k n — A±). 



n n 



Note for $j_1 < j_2$ that



+31 i+32 
n I n 



E 



£WV / i+J1 ,i /i+j 2 -l 



dsdt = 



by conditioning on $\mathcal{F}_{(i+j_1-1)/n}$ and using the martingale property of $W$. Therefore and
from Lemma 6.1 we have the bound

I Jn(p) 

E|Zf| < CA^V^n"- 3 ^ 1/4 < Cprr 1 , thus V E|Z?| < Cn~ 1/4 . (6.19) 



The remainder part of (6.4) is decomposed into three terms, namely



cr u dW u ) (a s - a ai ( P ) )dW s 



k n .i+i 

-y\ n , , 

fe n ^— ' L J i+i-i V h+j-i 

j = l n 

i+i rs - i+i 

I ( (<T U ~ Va e (p))dW u )(Ja l ( P )dW s + 



(6.20) 



i+j-l 



We begin with the first term, and to this end recall that $\sigma$ is a continuous Itô semimartingale
(and in particular that its driving Brownian motion is $V$ and we call its volatility
process f, which has a representation similar to (2.1)). It is sufficient to prove that



-Jn(p)b l (p)-l 2 k n 

\ \ \ (J 



i+j 



(W s -Wi+i-i)dW i 



fc n i+i s 

£ ( i+i-X ~ ^)*) 



(6.21) 



n n 



converges to zero in the usual sense, and the same arguments as above show that the 
error due to replacing a s — cr ae ( p y n by Ja t {p)/n^ r< ^ r 1S bounded by Cpra -1 / 2 . Then we 
successively replace a u by the corresponding a a ,u,\i n and for f as well. This error is 
of order Cp l / 2 n 4 / 4 each, and finally we are left with ^Jnjk n Yli=i^ a a e ( P )/n^a-i(p)/n^'" ■> 
setting R} = n 2 /k 3 S^S^ with 

k n k n „±ri „ s „ s 

€ = e - ^ € = E /L ( y^ ^ y^ ^K- 

J — In J — In n n 

By successive conditioning we obtain $E(S^{(1)})^4 \le Cn^{-3}$, as each summand of the quadruple
sum has a non-zero expectation only if it comes from two pairs of equal indices. Proving
an upper bound for the fourth moment of $S^{(2)}$ is a bit more involved, as one has to be
careful with conditioning. We decompose $S^{(2)} = S^{(2,1)} + S^{(2,2)}$ by splitting the $dV_s$ integral
into $(V_s - V_{(i+j-1)/n}) + (V_{(i+j-1)/n} - V_{a_\ell(p)/n})$. For the first term we obtain the bound
$E(S^{(2,1)})^4 \le Cn^{-5}$ in the same way as for $E(S^{(1)})^4$, and we focus on



k n „i+i 



s t = E L , ( w > - w^dwMv^ - V atW ). 

^— ' VJ i+3-l n In 

J = l n 

If one index in the corresponding quadruple sum is larger than any other, then conditioning
gives a zero expectation. Therefore the only non-trivial case with three indices being
different is when the two largest are the same. Nevertheless, a straightforward computation
shows that the expectation is then zero as well, and we obtain $E(S^{(2,2)})^4 \le Cp^2 n^{-4}$.
Overall, using the generalised Hölder inequality,

$$E(R_\ell^n)^2 \le Cp^2 k_n^2\,\frac{n^4}{k_n^6}\,n^{-3/2}\,pn^{-2} \le Cp^3 n^{-3/2}.$$

Furthermore, properties of the normal distribution prove $E(R_\ell^n) = 0$. Thus

Jn(p) J n (p) 

f E ( E ^aM^SS) < Cf E E W) 2 ^ Cp 2 n^ 2 . (6.22) 
Kn i=i " n Kn i=i 

The same arguments work for the second term in (6.20) as well. Finally,

, Jn{p)b t (p)-1 2 k n .i±i k n tti „ s 

/^ee |^e(/;_,w-^)^oe(/ + ;_ 1 / +j _,^''^. 

v n £=1 i=a e {p) n j=l J —^— j=l J —^— J —^— 

can be bounded by $Cn^{-1/4}$.

It remains to discuss (6.11). The analogues of (6.17)-(6.19) follow immediately from
Lemma 6.1 and so does the final step above. Thus all we need to prove is negligibility of

Jn(p)b e (p)-1 2 f i + k n 

£E E M// T^(v s+ ^-v s ) ds 

= 1 l=a l (p) n 



k i+i 

E ( °udW u (a s - a a _^)dW s ) , 

j = 1 n n 
and as before it is sufficient to discuss J ^Yljli T a l { v )ln (J a l Xv)ln^a t { v )lnG n t only, where 



6/(p)-l 2 

r n _ y- n K (i) q m K (i) 

c \ n 



. '+fcr, 



(V,,*a " V a )ds. 



We have E(lf f ( ^) 4 < Cn~ 3 and E^) 4 < Cp 2 n 4 as before. However, in contrast to 
the previous result is not unbiased. Therefore we compute a conditional expectation 
again. We have 



i + kn k n 



i+j-l 



°l(p) 



as before, thus $|E^n_{a_\ell(p)}(G_\ell^n)| \le Cpn^{-1}$. Since $E^n_{a_\ell(p)}(G_\ell^n)^2 \le Cp^3 n^{-3/2}$ as well, the result
follows as in (6.16).

The final step in the proof of (6.1) is concerned with the contribution of the small
blocks, so we have to show that



J„(p) a e+1 (p)-l 



lim limsupW-pE V ( V — ((CW - C ± ) + (Di^ - D ± )f 



=1 i=b e (p) 



2k„ r6n 



71 



, 2 "Mrt + t ^(p) 

" / 7l n n 



0. (6.23) 



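For this purpose and for later reasons, we compute the conditional expectation of the
approximated increments, and we will do this for the $\bar A$ and $\bar B$ terms only. We have

$$E^n_{a_\ell(p)}\big(\bar A_{(i+k_n)/n} - \bar A_{i/n}\big)^2 = \frac{n^2}{k_n^2}\,\sigma^4_{a_\ell(p)/n}\,E\Big(\sum_{j=1}^{k_n}\big(|\Delta^n_{i+k_n+j}W|^2 - |\Delta^n_{i+j}W|^2\big)\Big)^2 = \frac{4}{k_n}\,\sigma^4_{a_\ell(p)/n} \qquad (6.24)$$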



as well as 



E" £ (p)(-Bi±fcn - B, 





i + kn 




- 2 n \ 2 








: 

n 


s: 

n 


- 2 U \ 2 

~ A U2 T ^(p) 
""n n 






: 

n 


i: 

n 


- n \ 2 1 




, k 2 
"'n 


U2 a eW I 
""n n J 




.n 2 



n(V s+ !^ - Vs){V r+hk - V r )]drds 



r+- 



r H — - — s ) drds 
n 



i -\- k n 



n 



2 k 



n 2 



s) )ds = - — r 

3 n 



a t (p) 



The expectation of the mixed part is zero. Therefore the $U_\ell^{n,p}$ are indeed martingale
differences, and we have



bt(p)-l 



i=a e (p) 



C± ) + {D 1± ^-D ± )) 2 -' 2 ^ 

n n n fl 



6n ^ 4 Mr 2 
as well. (6.23) then follows from the fact that

n Mp) a e+1 ( P )-i 3 _ _ _ _ 2k 6n 2 

e=i i=b e (j>) 

is bounded by a constant times using Lemma 6.1 and $p \to \infty$. □



6.2 Proof of (2.13)



Let us check the conditions for stable convergence in this step, where particularly the
proof of (2.14) is tedious. Write $U_\ell^{n,p} = \sum_{s=1}^{3} U_\ell^{n,p,s}$ with



E i((^-^) 2 -^' 4 

*=»/(?) 

&*(p)-l n 



u, 



n,p,2 



E ^((^ 



i=a e (p) 

Mp)-i 



2 Zkn 2 



n,p,3 



- — (A i+k n — Ai)(B i+k n —Bi). 



«=*/(p) 



It turns out that only the $(U_\ell^{n,p,s})^2$ terms are responsible for the conditional variance,
whereas the remaining mixed ones are of small order each. Let us start with the pure $\sigma$
part in the conditional variance which is due to



b e (p)-i 



n,p,l\2 



i,m=a t {p) n 



Due to conditional independence of $\bar A_{(i+k_n)/n} - \bar A_{i/n}$ and $\bar A_{(m+k_n)/n} - \bar A_{m/n}$ for $|i - m| > 2k_n$
and because of (6.24) we have to discuss the remaining cases with $|i - m| \le 2k_n$ only. Let
us stay away from the boundary first and compute



b e (p)-2k n i+2k n 

E E 



9 



4k 2 

i=a£(p)-\-2k n m=i—2k n 



(A i+k n ^-.i-) (A m + fcn At 



, 2 u a e (p 



be(p)—2kn i—1 

E E 

i=ae(p)+2k n m=i—2kn 



(A i+k n — Az) 2 (A m + k n — Am) 2 ~ 

n n n n 

using Lemma 6.1. Recall the definition of $H_i^n$. The task is to simplify



al^+O^pn- 3 / 2 ), 



16 

~U2 

n Tl n 



E 



(A. i-\-k n A. j/) (A. rn-\-k n A.L 



4 

n 8 

™ " Jl,i2,i3,j4 = l 



E 



i+j2+k„ 



Tjn \ / ttu 
I2 i+j 2 )\ rL m+j3+k n 



+j3)\ 11 m+jA+k n 
In order for each expectation on the right hand side to not vanish, at least one index
of each parenthesis has to agree with one of another. Note that $j_1 = j_2$ and $j_3 = j_4$
corresponds exactly to the subtracted mean (apart from the small order case of four equal
indices), which is why we focus on the few cases left. Suppose that $i + j_1 = m + j_3$. In
this case either $i + j_2 = m + j_4$ or $i + j_2 = m + k_n + j_4$. The same options exist for
$i + j_1 = m + k_n + j_3$. By symmetry, the cases where indices within the first and the fourth
parenthesis agree can be discussed in the same way, which explains an additional factor 2.
Let $m < i - k_n$. Then the only possible case is $i + j_1 = m + k_n + j_3$ and $i + j_2 = m + k_n + j_4$,
whose contribution to the quadruple sum is

2 £ n(Hr +n ) 2 (H^ +n f] = — A (2k n - (i - m)) 2 + 0(n- 7 /2), 

iij2=i 

l<Ji +i-m— k n <k n 
l<32+i—rn—k n <k n 

using $E(H_i^n)^2 = 1/(2n^2)$. For $m > i - k_n$ all options are allowed, and a careful observation
shows that the quadruple sum becomes

2 2 1 

-r{kn - (t - m)) 2 - - m){k n - (i - m)) + — j(i - m) 2 + 0(n~ 7 / 2 ) 
4fc 2 - 12k n (i - m) + 9(i - m) ^ _ 7/2 

where the first term above is due to $i + j_1 = m + j_3$ and $i + j_2 = m + j_4$, the second
one belongs to the mixed parts, and the final one comes from $i + j_1 = m + k_n + j_3$ and
$i + j_2 = m + k_n + j_4$ again. An index transformation gives in total



b((p)—2k n i-1 



E E ^("^.[(V-W-^-V]-^) 

i=a e (p)+2k n m=i-2k n n n 
og b e (p)-2k n k n 

= a liM ^6 E Y.{ m2 + ( Ak n ~ l2k nm + 9m 2 )) + P (n- 3 / 2 ) 
n i=a e (p)+2k n m=l 

n. ""n 



A similar argument for the missing boundary terms reveals that their contribution equals 
a a e (p)/nd~i/kn f° r some unspecified di, independently of p, and again up to some small 
order terms. Overall, 



'a<,(p)\ u e ) - u «/(p) — p- 
Similarly, the main part of $E^n_{a_\ell(p)}(U_\ell^{n,p,2})^2$ is due to

b e (p)-2k n j-1 

y 



E E ^JK^S^-B^B^-B^]-^ ). (6.25) 

i=a^(p)+2fc„ m=i—2k n 
For a centred normal vector $(N_1, N_2, N_3, N_4)$ we have

$$E(N_1 N_2 N_3 N_4) = E(N_1 N_2)E(N_3 N_4) + E(N_1 N_3)E(N_2 N_4) + E(N_1 N_4)E(N_2 N_3).$$
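
As a worked special case of the identity above (the symbols $N$ and $M$ are introduced here for illustration only), take $N_1 = N_2 = N$ and $N_3 = N_4 = M$ for a centred jointly normal pair $(N, M)$. The last two products then coincide, which is the symmetry that produces a factor 2 once the mean is subtracted:

```latex
\begin{align*}
E(N^2 M^2) &= E(N^2)\,E(M^2) + 2\,\bigl(E(NM)\bigr)^2, \\
\operatorname{Cov}(N^2, M^2) &= E(N^2 M^2) - E(N^2)\,E(M^2) = 2\,\bigl(E(NM)\bigr)^2 .
\end{align*}
```

In particular, for $N = M$ this recovers the familiar fourth moment $E(N^4) = 3\,(E(N^2))^2$.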

Applied to the increments of $V$ in $(\bar B_{(i+k_n)/n} - \bar B_{i/n})^2(\bar B_{(m+k_n)/n} - \bar B_{m/n})^2$ we see that the
first of the three terms above corresponds to the mean, thus by symmetry (6.25) equals



i-l 



bt(p)-2k n 

E E 



9n 4 



£.6 T a i(p) 
i=ai(p)+2k n m=i—2k„ n n 



. i+k n 



m + k^ 



- Vs){V r+h ^ - V r )]drds) . 



For m < i — k n we have 

, i + k n a m + k n 
n 

E[(V, , ^ - V S ){V.^ - V r )]drds 



m J r 2k n 

1 f n m + 2fc n , 2 , (m + 2k n — ir 

2 A { ^—- S)dS = 6^ ' 



m-\-2k n m-j-fc 



and analogously 



E[(K+*a - V s )(V r+ ^ - V r )]drds 



Til ~f~fcyj 



(r H s)drds 

kn n 



(6.26) 



H — - — s)drds 
n 



+ 



m-\-k n m+fc 



(s H r)drds + 

77, J m+fc n / „ fc r , 



(r H — — — s)drds 
n 



6n 3 



for $m > i - k_n$. Therefore (6.25) becomes

b e ( P )-2k n 4 fc n 2 

E T^iM Ei(™ 6 + « - ^k n + 3m* f) = P^rt^ + P ( P n~V% 



k 6 MP) ^ 3 g n 6 
i=a e (p)+2k n n m=l 

so for a certain cfe we obtain 



1 m 

K iP) (ur 2 ) 2 = -tmip^ + ^ + o P (pn-^). 

The term $E^n_{a_\ell(p)}(U_\ell^{n,p,3})^2$ is responsible for the joint part of the conditional variance. Again,
we discuss only

bi(p)-2k n i-\ 



^] TJ-^O^fp) (^ ' + fcn - Ai)(5 , + tn -B.)(A m+t„ ~ All!. ) ( B m + k n ~Bl 

l\i/y-> L 71 71 71 71 71 ' 71 

i=a e (p)+2k n m=i—2k n 

in detail. Note that 



(A i + kn — A i )(B i + k„ — B i )(A m + k„ — A™)(B m + k„ ~ Bl 

n n n n n n n 1 

i + k n m+kn 



4 

^_ 4 2 \ " 

, 4 a a e (p) T a e (p) 2—1 I 



(6.27) 



iij2=i n 



H^ +j2 )]drds 
with the previously introduced notation. Thus we have to compute quantities like
H(Y ^ - V S )(V + hn - V r )H^ h H^ +j2 }, 

n n 

and one obtains that the expectation above is non-zero only for intervals satisfying the
condition $[s, s + k_n/n] \cap [r, r + k_n/n] \neq \emptyset$ and indices with $i + j_1 = m + j_2$. As usual, let
$m < i - k_n$ first. Then the double sum in (6.27) becomes

2k n — (i — m) m + 2k n m + kn 

- E /. / " n(V r+ ^-V s )\H? +j f]drds, (6.28) 

and the expectation factorises for all but a small order amount of choices for $s$ and $r$. For
(6.28) we thus obtain

2k n — (l — m) m + 2k„ m + k n . 

k n (2k n -{i- m)) 4 



^2 E L * f ! (r + ^-s)drds 



2n 2 ^— ' l± L-hn n 12n 5 

plus a term of small order $O(n^{-7/2})$, where we have used (6.26). Analogously, for $m > i - k_n$
the double sum equals



E ^w, - ^r +i ) 2 - E w ,L ' 



l<j'+i— m<k n l<j+i—in—k n <k n 

4/c 3 - 6(i - m) 2 fc„ + 3(i - m) 3 
6n 3 

_ (2fc n - 3(i - m)){Akl - 6(i - m) 2 k n + 3(z - m) 3 ) 
~ 12^5 ' 

again up to some $O(n^{-7/2})$. Overall, the main part responsible for the mixed terms is



bi(p)-2k n k n 




i=a e (p)+2k n n " m=l 

p—aj^Tt^ + Op(pn~ 3/2 ), 



E -T6 a tlM T lm E \ ( 8k n ~ l2k l m ~ l2k n m2 + m " m3 " 9m 



4 ^ m 4 



n n 



so finally 



Tpn fTT n ,P,3\2 4 2 + U3 _ , 

E a,(p)(^ ) =^Mp)^Mp) + Op(pw 

n n 



-3/2) 



It remains to prove that the mixed parts of $E^n_{a_\ell(p)}[(U_\ell^{n,p})^2]$ are $O_p(pn^{-3/2})$ each, which is
another simple but tedious task. We will drop these computations for the sake of brevity.
Altogether, we have 
thus (2.14) holds using $k_n \sim cn^{1/2}$. Simpler to obtain is (2.15), as Lemma 6.1 gives

2 Jn{p) 3 

which converges to zero in the usual sense. Finally, one can prove 

b e (p)-l 

K(p) \ £ ^((^i+M ~A ± ) + (B^ - B ± )) 2 (N ae+l(p) - Natto)] = 0, (6.29) 



=at(p) 

where $N$ is either $W$ or $V$ or when $N$ is a bounded martingale, orthogonal to $(W, V)$.
Focus on the first case and decompose $((\bar A_{(i+k_n)/n} - \bar A_{i/n}) + (\bar B_{(i+k_n)/n} - \bar B_{i/n}))^2$ via the
binomial theorem. For the pure $\bar A$ and the pure $\bar B$ term, the claim follows immediately
from properties of the normal distribution upon using that $\sigma_{a_\ell(p)/n}$ or $\tau_{a_\ell(p)/n}$ are
$\mathcal{F}_{a_\ell(p)/n}$-measurable. For the mixed term, one has to use the special form of $\bar A_{(i+k_n)/n} - \bar A_{i/n}$ as
a difference of two sums, and a symmetry argument proves (6.29) in this case. For an
orthogonal $N$, we use standard calculus. By Itô's formula, both $(\bar A_{(i+k_n)/n} - \bar A_{i/n})^2$ and
$(\bar B_{(i+k_n)/n} - \bar B_{i/n})^2$ are a measurable variable times the sum of a constant and a stochastic
integral with respect to $W$ and $V$, respectively. Thus (6.29) holds. In the mixed case, we
use the integration by parts formula to reduce $(\bar A_{(i+k_n)/n} - \bar A_{i/n})(\bar B_{(i+k_n)/n} - \bar B_{i/n})$ to the sum
of a constant, a $dW$ and a $dV$ integral. Then the same argument applies. Altogether, this
gives (2.16). □



6.3 Proof of Theorem 2.3



Let us begin with the proof of (2.18), for which we write

r 2 i = ^((A^ -A ± ) + (B^ -B ± )) 2 - 6-^cfl , o\ = Y>1 \^ +j W\\ (6.30) 

n 2K n n n n n fe„ n n 6K n Z ' n J 

1 = 1 



with 

1^ _3 A = f 4 53 (|A? +fc „ + ^| 2 - \X} +j W\ 2 ), 

n 

T±{V s+ kn - V s )ds. 

i_ n n 

n 

First of all, we have 

\nt\-2kn 

G?l = - E r\ai + O p (n-y% 

Tl n n 

1=1 
since $\sigma$ and $\tau$ are Itô semimartingales and thus the techniques from the proof of Lemma
6.1 show that both $\hat\tau^2_{i/n} - \tilde\tau^2_{i/n} = O_p(n^{-1/4})$ and $\hat\sigma^4_{i/n} - \tilde\sigma^4_{i/n} = O_p(n^{-1/4})$ hold. Note further
by conditional independence that

\nt\ -2k n 

(- £ (^M-EF[rM])) =O p (n-^). 

\ Tl n n n n ' 

1 = 1 

A simple computation gives $E^n_{i/n}[\tilde\tau^2_{i/n}\tilde\sigma^4_{i/n}] = \tau^2_{i/n}\sigma^4_{i/n} + O_p(n^{-1/2})$, so (2.18) follows again
from the Itô semimartingale property of both processes. The same techniques prove (2.17),
and we also have

\nt\ —2k n 

G?l= y ((Ai+^-A^ + iB^-B^ + Opin- 1 ^), 

* — ' n n n n 

1 = 1 

so all we have to do is to compute the conditional expectation of each summand. For the 
first term, this is simple, as we have 

up to an error of order $n^{-3/2}$. It is simple to show

n n n n OTL n n 

since we have seen already that the expectation factorises up to a small term error. Finally, 
properties of the normal distribution give 

_ T7 4 / f S \2 Ah 2 

^[(B 1± ^-B ± f] = l2-rU / E[(V r+ h - V s ) 2 ]drds) =^|r 4 . 

n n fe* „ \ J± J± ' T n / 3n Z n 

7i n 

The two remaining terms have zero expectation. □ 
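
The moment computations for the $B$ terms rest on the double integral of the covariance of overlapping Brownian increments, $E[(V_{s+h} - V_s)(V_{r+h} - V_r)] = (h - |s - r|)^+$ with $h = k_n/n$, whose integral over $[0,h]^2$ equals $\tfrac{2}{3}h^3$. The following numerical check is only an illustrative sketch (midpoint rule, hypothetical function names); it is not part of the proofs.

```python
def overlap_covariance(s, r, h):
    """Covariance of V_{s+h} - V_s and V_{r+h} - V_r for a standard Brownian motion."""
    return max(0.0, h - abs(s - r))


def double_integral(h, grid=200):
    """Midpoint-rule approximation of the integral of overlap_covariance over [0, h]^2."""
    step = h / grid
    total = 0.0
    for a in range(grid):
        s = (a + 0.5) * step
        for b in range(grid):
            r = (b + 0.5) * step
            total += overlap_covariance(s, r, h)
    return total * step * step
```

Multiplying the exact value $\tfrac{2}{3}h^3$ by the normalisation $n^2/k_n^2$ gives $2k_n/(3n)$, the factor in front of $\tau^2$ in the conditional second moment, and three times its square gives the fourth-moment constant $4k_n^2/(3n^2)$.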



6.4 Proof of Lemma 3.1

We will only prove the first result. Note that 

[nt]~2k n ± 

B t -B t = V / (T 2 (s,X s ,a 2 )-T 2 (-,X ± ,a 2 t ))ds + O p (n- 1 / 2 ), 

1 = ~ 

the error coming from border terms in $B_t$, for which we have used boundedness of the function
$\tau^2$, due to differentiability and the assumption that any process involved is bounded
itself. We have

T 2 (s,X s ,a 2 )-T 2 (-,X ± ,a 2 ± ) (6.31) 

n n n 

= (r 2 (s,X s ,a 2 )-T 2 (-,X s ,a 2 )) + (r 2 (-,X s ,a 2 )-r 2 (-,X ± ,a 2 ) 

n n n « 

+ (T 2 ^,X ± ,a 2 )-T 2 (^,X ± ,a 2 )) + (T 2 (^,X ± ,a 2 )-T 2 ^,X ± ,a 2 ). 

fl» TL n n fl n „ Tl n n 
The first three terms can be discussed in the same way. From differentiability we may
conclude

[ntj— 2k„ „i . [nt\-2k n i_ 

£ j^(r 2 (s,X s ,a 2 )-r 2 ^,X s ,o 2 s ))ds = £ -r 2 (£, X„ <t 2 )(s - l -)ds 

i=0 n i=0 n 

for a suitable $\xi$, and the term is obviously of order $n^{-1}$. In the same way, we see that
the second and third term in (6.31) are of order $n^{-1/2}$ each. For the last quantity, we use
twice differentiability and Lemma 6.1 to obtain

TI n n n n n Q(J L Tl n n n n 

uniformly in $i$. Also, $\hat\sigma^2_{i/n} - \sigma^2_{i/n} = M_i^n + O_p(n^{-1/2})$, where the

$$M_i^n = \frac{n}{k_n}\sum_{j=1}^{k_n} 2\int_{(i+j-1)/n}^{(i+j)/n}(X_s - X_{(i+j-1)/n})\sigma_s\,dW_s + \frac{n}{k_n}\int_{i/n}^{(i+k_n)/n}\int_{i/n}^{s}\tau_u\,dV_u\,ds$$

are martingale differences of order $n^{-1/4}$. Therefore Lemma 3.1 follows from

[nt\-2k„ _ . 

\n ^-^ ocr z n n - J 

i=0 

where we have used Lemma 6.1 again plus $E[M_i^n M_j^n] = 0$ for $|i - j| > k_n$. □

6.5 Proof of Lemma 3.2

From (2.1) and by differentiability of the function $\tau^2$ we have

71 2 fc 77 

n Z ' V n n n „ n IT. n n / 

i=0 

The first claim is 

- V f^fr^.I^^-r^-,!,,^))^^ 1 / 2 ). (6.32) 
1=0 

Recall $M_i^n$ from the previous proof and set

$$\bar M_i^n = \frac{n}{k_n}\sum_{j=1}^{k_n} 2\sigma^2_{i/n}\int_{(i+j-1)/n}^{(i+j)/n}(W_s - W_{(i+j-1)/n})\,dW_s + \frac{n}{k_n}\,\tau_{i/n}\int_{i/n}^{(i+k_n)/n}(V_s - V_{i/n})\,ds.$$

Standard methods give $M_i^n - \bar M_i^n = O_p(n^{-1/2})$. Using the mean value theorem, we
conclude that the left hand side of (6.32) equals

-t Tl 2/c n r\ 

- £ ^ + O p (n- 1 / 2 ), ^ = ^ r 2(l )X± ,a 2 )r 2 Mr. 

Tl 0(T A JI n n n 

i=0 
$\vartheta_i^n$ is obviously of order $n^{-1/4}$, so by conditional independence we have

E (- E W " E M)) = ^ E E ^ " E MX^ - E?[0?])] < Cn-\ 



|i-i|<2fc„ 



We omit the computation of $E_i^n[\vartheta_i^n]$ in detail. Standard arguments give $|E_i^n[\vartheta_i^n]| \le Cn^{-1/2}$, from
which (6.32) follows, so we have



n— 2k 



c-c = - V (ff- r 2) T 2 ( l )X , )CT 2 ) + (n -i /2) 

i=o 

We will use the same blocking technique as in the proof of Theorem 2.1 now. Let $I_n(p)$
be defined as $J_n(p)$ before, but with $t = 1$. We proceed in two steps. The first one is



, -. In(p)b e {p)-1 . 

*=1 »=o«(p) 



and there is of course a related result concerning the small blocks. This result is in fact
quite simple to show. The assumptions on the function $\tau^2$ and growth conditions of
continuous Itô semimartingales reduce the claim to



lim limsupJ— E - V ^-t 2 ( — , X ai(p ) , o\ {p) ) V 

t=l r=eu (p 



in(p) Q _ b e (p)-l 

t\ — r\ ){Xi_ — Xa £ ( P ) 

n n n 

i=a e (p) 



and an analogous one involving the partial derivative with respect to $\sigma^2$ and increments of
$\sigma^2$, which can be discussed in the same way. Let $\bar\tau^2_{i/n}$ be defined as $\tilde\tau^2_{i/n}$ in (6.30), but with
$A$ and $B$ replaced with $\bar A$ and $\bar B$, respectively, and $\sigma_{i/n}$ with $\sigma_{a_\ell(p)/n}$. Denote with $N_\ell$ an
unspecified $\mathcal{F}_{a_\ell(p)/n}$-measurable random variable. Then Lemma 6.1 and growth conditions
again show that we might as well prove that



I — 1 In(p) b e (p)-l 

\ V- E \~J2 N e E (Tl-Tl £M )cr a _^(W ± -W aeiP : 

V K n ln , . , , n „ n n 

e=l i=ae(p) 



becomes small, which follows in a similar way as (6.22) by conditional independence. For
the second step recall $U_\ell^{n,p}$ from (2.11). We will show that



I In(p) , , b e (p)-l 

hmlimsup./^E ^r 2 (^,X Mp) ,< (p) )( E H# =0(6.33) 

«=1 i=a-e{p) 

holds. For the increments involving $A$ and $B$ within $\hat\tau^2_{i/n} - \tilde\tau^2_{i/n}$, the proof is identical to
the one of (2.12). Let us show



-fn(p) b e (p)-l 2 k n 



lim lim sup Jf E E ^ E i" E ( A ^ 4 " ^ 

l=\ i=a e (p) j=i 



(p) 



0, (6.34) 
for which we use the decomposition

\[
(\Delta_{i+j}^n X)^4 - 3n^{-2}\sigma^4_{a_\ell(p)/n}
= \big((\Delta_{i+j}^n X)^4 - \sigma^4_{(i+j-1)/n}(\Delta_{i+j}^n W)^4\big)
+ \sigma^4_{(i+j-1)/n}\big((\Delta_{i+j}^n W)^4 - 3n^{-2}\big)
+ 3n^{-2}\big(\sigma^4_{(i+j-1)/n} - \sigma^4_{a_\ell(p)/n}\big).
\]
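The centring constant $3n^{-2}$ in the second term is the fourth moment of a Brownian increment over an interval of length $1/n$, since $E[Z^4] = 3\,\mathrm{Var}(Z)^2$ for a centred normal variable $Z$. A minimal Monte Carlo sketch of this fact follows; the function name, the value of `n` and the number of draws are illustrative choices and not part of the paper.

```python
import math
import random


def fourth_moment_of_increment(n, draws, seed=0):
    """Monte Carlo estimate of E[(W_{1/n} - W_0)^4] for a standard Brownian motion W."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / n)  # an increment over a step of length 1/n is N(0, 1/n)
    return sum(rng.gauss(0.0, sd) ** 4 for _ in range(draws)) / draws


n = 100                     # illustrative number of observations per unit interval
estimate = fourth_moment_of_increment(n, draws=200_000)
exact = 3.0 / n ** 2        # fourth moment of N(0, 1/n)
print(estimate, exact)
```

With this centring, the second term of the decomposition has conditional mean zero, which is what the subsequent variance computation relies on.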



Plugging in the first term on the right-hand side gives a small order in (6.34) due to the
growth condition on $\sigma$. For the second one, note that

f E (l> S ^E^((Ar + ^) 4 -3n-)) 

£=1 i= 0/ (p) n j=l 
i»(p) 6<(p)-l 4 fc„ 

= f E E ^E E (^lti-((A? + ^) 4 -3n- 2 ) 2 <Cn-. 

£=1 i=a e (p) 3=1 

For the third term above we use the standard argument of approximating $\sigma^4_{(i+j-1)/n} - \sigma^4_{a_\ell(p)/n}$
by an $\mathcal{F}_{a_\ell(p)/n}$-measurable variable times an increment of $V$, plus conditional independence.
For the same reason, the error due to t?, — t^w^ is small, which gives (16.34[) . Together 
with (2.12) we have thus shown

In(p) 

a lim sup 4 / — \ (Vt - Vt) — 



lim ]hnmpJj-{(y t - V) - ^ C/ "' Pl {6 ! (p)<NJ-2fe»}} = °> 

, 7 n(p) / x 

lim lim sup Jf {(C- C) - £ r 2 (^, X a , {p) , < (p) = 0. 



In order to prove the analogue of (2.7), we use a multivariate version of the result in
[23]. The analogues of (2.15) and (2.16) are obtained in exactly the same way as for the
one-dimensional result, and it is also quite simple to deduce

A»(p) ~i 
lim limsup-^- ^ IE[(^ n ' p ) 2 l {fei(p )< Ln ( tiAtj )j_2fc„}] = / a 2 s l [0>t . At] (s)ds, 

p— too n—too K n Jq 

lim lim sup E [(^"' P ) 2r2 (^^»^^M» CT LM) 1 {^(p)<L«tiJ-2fcn}] 

p— i>oo K n ^— J Tl n „ 

-1 

,2 2/ ~2 



a s r (s,X s ,a s )l [0tti] (s)ds, 
limhmsup^^E[(C/;' p )V(^,X a , (p) ,< (p) )]= / a 2 r 4 (s, X s , a 2 )cfe, 



for arbitrary $t_i, t_j$. Proving that stable convergence indeed holds for each fixed $p$ is just
another tedious task. □
n        mean      variance   .025     .05      .1       .9       .95      .975
2500     -0.129    0.901      .0098    .0345    .0961    .9215    .9561    .9721
10000    -0.040    1.020      .0152    .0395    .0992    .8974    .9426    .9665
22500    -0.005    0.994      .0180    .0405    .0906    .8993    .9424    .9678
40000     0.024    1.029      .0184    .0428    .0952    .8918    .9446    .9692
52900     0.061    1.033      .0193    .0399    .0911    .8878    .9380    .9672

Table 1: Mean/variance and simulated quantiles of the infeasible test statistic (4.1)
for ρ = 0.
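The columns of such a table can be produced from a vector of simulated, standardised statistics. The sketch below assumes that the quantile columns report the fraction of simulated statistics at or below the corresponding standard normal quantile; the function name `summarise` and the placeholder normal draws are illustrative and do not come from the paper.

```python
import random
from statistics import NormalDist, fmean, variance

LEVELS = (0.025, 0.05, 0.1, 0.9, 0.95, 0.975)


def summarise(stats, levels=LEVELS):
    """Return mean, sample variance and empirical coverage at N(0,1) quantiles."""
    std_normal = NormalDist()
    m = len(stats)
    coverage = {}
    for a in levels:
        q = std_normal.inv_cdf(a)  # theoretical a-quantile of N(0,1)
        coverage[a] = sum(s <= q for s in stats) / m
    return fmean(stats), variance(stats), coverage


# Placeholder input: exact N(0,1) draws stand in for simulated test statistics.
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
mean, var, coverage = summarise(draws)
print(mean, var, coverage)
```

If the central limit theorem is a good finite-sample approximation, the mean should be close to 0, the variance close to 1, and each coverage value close to its nominal level.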



n        mean      variance   .025     .05      .1       .9       .95      .975
2500     -0.132    0.931      .0115    .0358    .0984    .9195    .9527    .9724
10000    -0.048    1.008      .0153    .0400    .0950    .9022    .9457    .9677
22500    -0.126    0.928      .0206    .0463    .1085    .9221    .9579    .9793
40000     0.021    0.995      .0193    .0423    .0945    .8959    .9457    .9717
52900     0.051    1.027      .0187    .0434    .0950    .8907    .9407    .9675

Table 2: Mean/variance and simulated quantiles of the infeasible test statistic (4.1)
for ρ = 0.2.



n        mean      variance   .025     .05      .1       .9       .95      .975
2500     -0.287    0.965      .0526    .0932    .1619    .9572    .9862    .9965
10000    -0.170    1.023      .0449    .0799    .1425    .9325    .9757    .9928
22500    -0.112    1.002      .0404    .0696    .1253    .9271    .9722    .9914
40000    -0.073    1.029      .0401    .0703    .1235    .9203    .9690    .9874
52900    -0.031    1.022      .0368    .0653    .1157    .9154    .9633    .9872

Table 3: Mean/variance and simulated quantiles of the feasible test statistic from
Corollary 2.7 for ρ = 0.



n        mean      variance   .025     .05      .1       .9       .95      .975
2500     -0.295    0.971      .0552    .0963    .1614    .9559    .9864    .9962
10000    -0.176    1.013      .0464    .0808    .1427    .9369    .9770    .9940
22500    -0.226    0.987      .0480    .0840    .1476    .9436    .9776    .9932
40000    -0.075    1.001      .0410    .0673    .1217    .9254    .9713    .9904
52900    -0.040    1.019      .0396    .0677    .1171    .9180    .9663    .9879

Table 4: Mean/variance and simulated quantiles of the feasible test statistic from
Corollary 2.7 for ρ = -0.2.
n        .01     .025    .05     .1      .2
2500     .018    .040    .064    .120    .216
10000    .010    .018    .040    .084    .194
22500    .016    .024    .034    .088    .194
40000    .020    .038    .068    .128    .220
52900    .010    .020    .052    .118    .200

Table 5: Simulated level of the bootstrap test based on the standardised
Kolmogorov-Smirnov statistic $Y_n$.
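Simulated levels of this kind are rejection frequencies over repeated Monte Carlo runs. The sketch below shows the generic bookkeeping only: a bootstrap p-value per run and the fraction of runs rejecting at each nominal level. The toy statistic (absolute values of normal draws) and the run and replication counts are placeholders, not the paper's $Y_n$ or its simulation design.

```python
import random

LEVELS = (0.01, 0.025, 0.05, 0.1, 0.2)


def bootstrap_p_value(observed, bootstrap_stats):
    """Fraction of bootstrap replicates at least as large as the observed statistic."""
    return sum(b >= observed for b in bootstrap_stats) / len(bootstrap_stats)


def rejection_rates(p_values, levels=LEVELS):
    """Fraction of Monte Carlo runs whose p-value is at or below each nominal level."""
    m = len(p_values)
    return {a: sum(p <= a for p in p_values) / m for a in levels}


# Toy null experiment: observed and bootstrap statistics share one distribution.
rng = random.Random(1)
p_values = []
for _ in range(500):                                   # Monte Carlo runs
    observed = abs(rng.gauss(0.0, 1.0))                # placeholder test statistic
    replicates = [abs(rng.gauss(0.0, 1.0)) for _ in range(200)]
    p_values.append(bootstrap_p_value(observed, replicates))

rates = rejection_rates(p_values)
print(rates)
```

Under the null hypothesis the rates should lie close to the nominal levels; applied to runs simulated under an alternative, the same function yields rejection probabilities that estimate power.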



alt      γ =                                     γ = 1
n        .01     .025    .05     .1      .2      .01     .025    .05     .1      .2
2500     .028    .052    .082    .134    .262    .044    .090    .156    .248    .372
10000    .032    .048    .086    .138    .260    .036    .084    .176    .284    .396
22500    .024    .042    .068    .138    .302    .032    .086    .162    .284    .432
40000    .028    .046    .094    .196    .426    .028    .064    .120    .310    .482
52900    .026    .040    .082    .174    .422    .024    .058    .144    .320    .488

Table 6: Simulated rejection probabilities of the bootstrap test based on the standardised
Kolmogorov-Smirnov functional statistic $Y_n$ for various alternatives.



